The FIFA World Cup 2022 in Qatar saw many surprising results. Too many, compared with previous tournaments. I'm running an experiment where I consistently bet on the least likely outcome and track how my fictional balance changes over time. This simple strategy would have worked fantastically in 2022 [OC]
/r/dataisbeautiful
https://redd.it/zcm4r0
JSON data of ingredient combinations into recipe outputs
Hi there,
I've done a search of the sub and looked at a few sources but none seem to quite fit what I am looking for.
I am looking for a dataset (preferably JSON, but I can convert it from other formats) which has:
1. Ingredients as inputs.
2. Recipes as outputs, for example:
{"scrambled egg":
{"ingredients": [
{"name": "milk", "amount": 100, "units": "ml"},
{"name": "egg", "amount": 2, "units": "whole"}]}}
The format is flexible. My use case is a cooking system for a non-commercial game, so I need a dataset that is machine-readable without having to use neural networks for language processing.
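As a rough illustration of the lookup such a dataset would support, here is a minimal Python sketch. It assumes the ingredients are stored as a JSON array and that pantry amounts use the same units as the recipe. The `craftable` helper and the pantry values are hypothetical, made up for this example.

```python
import json

# Hypothetical data in the shape sketched in the post (ingredients as a JSON array).
RECIPES_JSON = """
{"scrambled egg": {"ingredients": [
    {"name": "milk", "amount": 100, "units": "ml"},
    {"name": "egg", "amount": 2, "units": "whole"}]}}
"""


def craftable(recipes, pantry):
    """Return the names of recipes the pantry can fully cover.

    `pantry` maps ingredient name -> amount on hand, assumed to be in the
    same units as the recipe (no unit conversion is attempted).
    """
    names = []
    for name, recipe in recipes.items():
        needed = recipe["ingredients"]
        if all(pantry.get(i["name"], 0) >= i["amount"] for i in needed):
            names.append(name)
    return names


recipes = json.loads(RECIPES_JSON)
print(craftable(recipes, {"milk": 250, "egg": 3}))  # ['scrambled egg']
print(craftable(recipes, {"milk": 250, "egg": 1}))  # []
```

A real game would probably also need unit conversion and a way to rank partial matches, neither of which is covered here.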
Thanks in advance, all
/r/datasets
https://redd.it/zckme1
Has Russia been at war with different European Countries?
/r/MapPorn
https://redd.it/zc6n87
[R][P] I made a Hugging Face gradio demo for text-to-3D paper Score Jacobian Chaining
/r/MachineLearning
https://redd.it/zbwwmc
2 Twitter Datasets for Finance-related Tweets Have Been Open-Sourced
Hi 👋,
I want to share two datasets I built for multi-class text classification. One dataset classifies finance-related tweets by sentiment (bullish, bearish, neutral) and the other classifies them by finance topic (20 topics). Both are released under the MIT license. Feel free to explore!
topic: https://huggingface.co/datasets/zeroshot/twitter-financial-news-topic
sentiment: https://huggingface.co/datasets/zeroshot/twitter-financial-news-sentiment
/r/datasets
https://redd.it/zbhzcz
Hot take: Kaggle for entry-level CVs is very mid-2010s. Here's what I'd do instead.
Kaggle can be fun, but don't do it because you think it'll land you a job: that strategy has peaked and the noise is too high. People don't want to know that you can apply some canned ML to a canned problem, and the frontier of ML research is deep into AI at this point, to the point where it's just straight up a different career. What I'd suggest instead is to practice asking questions and finding answers, which for this purpose should be as eye-catching as possible.
Download some city data and make a hilariously detailed plan for how to get good parking. "Good" can mean the cheapest, or you can really have fun and try to optimize for free parking at the risk of getting fined. Really learn about the domain, like be able to explain why it looks different on weekends because they allow alternate side parking or something. Bonus points for driving to the city and trying it out for real. Explain why your model's oversimplified.
This is just an example. IMHO it gets more to the heart of what data science really is today.
/r/datascience
https://redd.it/zby4e4
Languages of Britain and Ireland, 400 AD - 1900 AD.
/r/MapPorn
https://redd.it/zbph8v
Were there too many upsets in the group-stage games at the FIFA World Cup in Qatar? What if I had bet the same amount against the odds in every group-stage game? WOW – I would have finished the experiment with a net profit of almost 48 coins! [OC]
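The flat-stake experiment described here can be sketched in a few lines of Python. This is only an illustration under stated assumptions: decimal odds, a stake of 1 coin per game, and "least likely outcome" taken to mean the outcome with the longest odds. The `underdog_balance` helper and the two sample games are made up for the example and are not the post's data.

```python
def underdog_balance(matches, stake=1.0):
    """Track the running balance when always backing the longest odds.

    `matches` is a list of (odds, result) pairs, where `odds` maps each
    outcome to its decimal odds and `result` is the outcome that occurred.
    """
    balance = 0.0
    history = []
    for odds, result in matches:
        pick = max(odds, key=odds.get)  # longest odds = least likely outcome
        balance -= stake                # pay the stake up front
        if pick == result:
            balance += stake * odds[pick]  # decimal odds include the stake
        history.append(round(balance, 2))
    return history


# Made-up odds for two hypothetical games, not real tournament data.
sample = [
    ({"home": 1.2, "draw": 6.5, "away": 15.0}, "away"),  # underdog wins
    ({"home": 2.0, "draw": 3.2, "away": 3.9}, "home"),   # underdog loses
]
print(underdog_balance(sample))  # [14.0, 13.0]
```

One win at long odds covers many losing stakes, which is why a tournament with several upsets can leave this strategy in profit.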
/r/dataisbeautiful
https://redd.it/zbfu8j
Top 10 Highest and Lowest crime rate countries
/r/Infographics
https://redd.it/zbh723
The Public's Perceptions on the Metaverse (All Welcome)
Hi! Firstly, this is an alt account, sorry about that. I just felt it'd be better to keep my research stuff on a separate account. I'm an integrated master's student working on my second dissertation!
Anyway, I'm currently conducting a survey on metaverse perceptions, asking the general public what they think about the concept and specific aspects of its design. Whether you love or loathe the idea, or have no idea what a metaverse is, I'd love to hear what you think in this survey. It should take no more than 20 minutes, depending on how much you have to say. Thank you in advance to anyone who responds, and if you have any questions, feel free to reply and let me know. Cheers!
https://bathpsychology.eu.qualtrics.com/jfe/form/SV_3CP6skhtcG8cCtU
/r/SampleSize
https://redd.it/zbg4zc
Looking for yearly lime price data for Mexico
I am currently doing a project for my master's thesis that requires yearly data of lime prices in Mexico from at least 1990 to 2017. I am looking to merge this with an already existing data set on crime in various Mexican municipalities. Any data set you can provide me that would aid in my project would be super helpful, thanks!
/r/datasets
https://redd.it/za694y
This is the farthest place on earth from any ocean
/r/MapPorn
https://redd.it/zclgxn
Europe/North Africa overlaid on US/Canada by latitude
/r/MapPorn
https://redd.it/zc9ry9
Mapped: Global Energy Prices, by Country in 2022
/r/visualization
https://redd.it/zc3cch
Help using the healthcare price transparency data from insurers (published since July 1)
I've been playing around with the data for the last couple of weeks and am wondering if anyone here has had any success. The problems I'm facing are:
1. The files are insanely large. Eventually the only way I've been able to open them is using the DADROIT large JSON viewer: http://dadroit.com (but even this only worked when I got an M1 Mac)
2. There's a ton of them and I have to trawl through hundreds of files which take ages to load, in order to get to one that contains the data that I need (I have a list of providers by NPI or TIN, and I want to find all the plans that have contracts with those doctors or provider groups, and what the negotiated rates are)
Has anyone had success or found any tools that allow you to browse the files and search them without having to do stuff locally?
Looking at files like these for Aetna: https://health1.aetna.com/app/public/#/one/insurerCode=AETNACVS_I&brandCode=ALICFI/machine-readable-transparency-in-coverage and these for United: https://transparency-in-coverage.uhc.com/?_gl=1*1arv6qk*_ga*NjE5NTQzMTM2LjE2NzAwODEwMDY.*_ga_HZQWR2GYM4*MTY3MDA4MTAwNy4xLjEuMTY3MDA4MTcwMy4wLjAuMA..
From what I can tell, the files are pretty consistent, so it's going to be relatively straightforward to do the work. It just takes a gigaton of time to find something that can do it.
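One way to search files this large without opening them whole is to stream the rate entries one at a time. The Python sketch below uses only the standard library and still runs locally, so it is not the hosted tool asked about above. It assumes each file is a single JSON object with an `in_network` array of objects, and that field names such as `negotiated_rates`, `provider_groups`, `npi`, `tin` and `negotiated_prices` match the published Transparency in Coverage schema. Those names are written from memory and should be checked against the actual files. Both helper functions are hypothetical.

```python
import io
import json


def iter_array_items(stream, key, chunk_size=65536):
    """Yield the objects of the array stored under `key`, one at a time.

    Assumes the array elements are JSON objects and that the quoted key
    does not appear earlier in the file inside some other string.
    """
    decoder = json.JSONDecoder()
    marker = '"%s"' % key
    buf = ""
    # Phase 1: read until the '[' that opens the array is in the buffer.
    while True:
        pos = buf.find(marker)
        if pos != -1:
            start = buf.find("[", pos + len(marker))
            if start != -1:
                buf = buf[start + 1:]
                break
        else:
            buf = buf[-len(marker):]  # keep only a possible partial match
        chunk = stream.read(chunk_size)
        if not chunk:
            return  # key not found
        buf += chunk
    # Phase 2: decode one element at a time, reading more when incomplete.
    while True:
        buf = buf.lstrip(" \t\r\n,")
        if buf.startswith("]"):
            return
        try:
            item, end = decoder.raw_decode(buf)
        except json.JSONDecodeError:
            chunk = stream.read(chunk_size)
            if not chunk:
                return  # file ended mid-element
            buf += chunk
            continue
        yield item
        buf = buf[end:]


def rates_for_providers(stream, npis=(), tins=(), chunk_size=65536):
    """Yield (billing_code, negotiated_rate) for matching NPIs or TINs."""
    npis, tins = set(npis), set(tins)
    for item in iter_array_items(stream, "in_network", chunk_size):
        for rate in item.get("negotiated_rates", []):
            hit = any(
                npis.intersection(g.get("npi", []))
                or g.get("tin", {}).get("value") in tins
                for g in rate.get("provider_groups", [])
            )
            if hit:
                for price in rate.get("negotiated_prices", []):
                    yield item.get("billing_code"), price.get("negotiated_rate")


# Tiny made-up document in the assumed shape, standing in for a real file.
demo = json.dumps({"reporting_entity_name": "Example Plan", "in_network": [
    {"billing_code": "99213", "negotiated_rates": [{
        "provider_groups": [{"npi": [1111111111],
                             "tin": {"type": "ein", "value": "12-3456789"}}],
        "negotiated_prices": [{"negotiated_rate": 85.0}]}]},
    {"billing_code": "99214", "negotiated_rates": [{
        "provider_groups": [{"npi": [2222222222],
                             "tin": {"type": "ein", "value": "98-7654321"}}],
        "negotiated_prices": [{"negotiated_rate": 120.0}]}]},
]})
print(list(rates_for_providers(io.StringIO(demo), npis=[1111111111])))
```

For a real gzipped file, the stream could come from `gzip.open(path, "rt", encoding="utf-8")`. Files that point to providers through a separate reference list rather than inline `provider_groups` would need an extra lookup step that this sketch does not attempt.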
/r/datasets
https://redd.it/zbizft
Critique My Resume - Don’t Hold Back - Thank You
/r/datascience
https://redd.it/zbr91d
Provinces and Territories of Canada as European countries of similar population
/r/MapPorn
https://redd.it/zbydt3
[OC] Most Medals Won at the World Cup Football
/r/dataisbeautiful
https://redd.it/zbimgy
[OC] Mexico has been in a water crisis for at least 20 years
/r/dataisbeautiful
https://redd.it/zbso2i
Topography of Scotland. This map shows 10m contour lines for Scotland [OC]
/r/dataisbeautiful
https://redd.it/zbj9f5