datascientology | Education

Telegram channel datascientology - Data Scientology

1234

Hot data science related posts every hour. Chat: https://telegram.me/r_channels Contacts: @lgyanf

Data Scientology

Realistically-sized 3D visualization of replacing the Washington Monument with Barad-dûr, Sauron's Dark Tower [OC]

/r/dataisbeautiful
https://redd.it/yv0cei

Data Scientology

[OC] Most valuable brands this millennium

/r/dataisbeautiful
https://redd.it/yuvx5t

Data Scientology

[Research] Monolith: Real Time Recommendation System With Collisionless Embedding Table

Building a scalable and real-time recommendation system is vital for many businesses driven by time-sensitive customer feedback, such as short-videos ranking or online ads.

Despite the ubiquitous adoption of production-scale deep learning frameworks like TensorFlow or PyTorch, these general-purpose frameworks fall short of business demands in recommendation scenarios for various reasons: on one hand, tweaking systems based on static parameters and dense computations for recommendation with dynamic and sparse features is detrimental to model quality; on the other hand, such frameworks are designed with batch-training stage and serving stage completely separated, preventing the model from interacting with customer feedback in real-time.

These issues led us to reexamine traditional approaches and explore radically different design choices. In this paper, we present Monolith, a system tailored for online training.

Our design has been driven by observations of our application workloads and production environment that reflect a marked departure from other recommendation systems.

Our contributions are manifold: first, we crafted a collisionless embedding table with optimizations such as expirable embeddings and frequency filtering to reduce its memory footprint; second, we provide a production-ready online training architecture with high fault-tolerance; finally, we proved that system reliability could be traded off for real-time learning. Monolith has successfully landed in the BytePlus Recommend product.

Read more: https://arxiv.org/abs/2209.07663
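To make the embedding-table idea in the abstract more concrete, here is a minimal sketch, not Monolith's actual implementation. It assumes a plain Python dict stands in for the real hash table, and the names `CollisionlessEmbeddingTable`, `min_count`, and `ttl` are hypothetical. Keying rows directly by feature ID avoids the collisions that come from hashing IDs into a fixed number of slots; frequency filtering and expiry then keep the table from growing without bound.

```python
import random


class CollisionlessEmbeddingTable:
    """Toy embedding table keyed directly by feature ID (no modulo hashing)."""

    def __init__(self, dim, min_count=2, ttl=100, seed=0):
        self.dim = dim
        self.min_count = min_count  # hypothetical admission threshold
        self.ttl = ttl              # hypothetical expiry window, in steps
        self.rng = random.Random(seed)
        self.counts = {}     # feature ID -> occurrences seen before admission
        self.rows = {}       # feature ID -> embedding vector
        self.last_seen = {}  # feature ID -> step of the most recent lookup

    def lookup(self, feature_id, step):
        """Return the embedding for feature_id, or None if not yet admitted."""
        if feature_id not in self.rows:
            # Frequency filtering: an ID gets a row only after min_count sightings.
            self.counts[feature_id] = self.counts.get(feature_id, 0) + 1
            if self.counts[feature_id] < self.min_count:
                return None
            del self.counts[feature_id]
            self.rows[feature_id] = [
                self.rng.gauss(0.0, 0.01) for _ in range(self.dim)
            ]
        self.last_seen[feature_id] = step
        return self.rows[feature_id]

    def expire(self, step):
        """Drop embeddings not looked up within the last ttl steps."""
        stale = [
            fid for fid, seen in self.last_seen.items() if step - seen > self.ttl
        ]
        for fid in stale:
            del self.rows[fid]
            del self.last_seen[fid]
        return len(stale)
```

In this sketch a rare ID costs only a counter entry until it crosses the threshold, and an expired ID has to earn its row again from scratch. Whether the real system resets counts on expiry is not stated in the abstract.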

/r/MachineLearning
https://redd.it/yuk9ga

Data Scientology

Trinidad and Tobago’s Ammonia Production from 2017 to 2021 averaged around 5 million metric tonnes, while exports averaged 4.25 million metric tonnes.

/r/visualization
https://redd.it/yu811u

Data Scientology

Do Data Scientists have to constantly study and upskill?

Hi there,

I'm currently working in a hybrid data analyst/strategy role at my company, and I'm interested in going back to school for a master's in data science. I recognize it's a pretty glamorized field, so I was wondering if people in this thread could outline some of the cons of the field. My questions are below.

- Is this field in demand, and what are the usual entry-level requirements to break in?

- Do data scientists generally stay in this field long term, or do they look to transition? If so, where do they transition to?

- Is this a field where you constantly have to study and upskill? Mind you, I'm motivated and hardworking, but having to constantly study in my 40s/50s might be a bit difficult with a family and kids.

- Any other cons?

/r/datascience
https://redd.it/yukbcd

Data Scientology

[R] Unifying Diffusion Models' Latent Space, with Applications to CycleDiffusion and Guidance + Diffusers and Gradio Demo

/r/MachineLearning
https://redd.it/ytwygr

Data Scientology

[Research] Can we get access to large language models (PaLM 540B, etc.) like GPT-3, but at no cost?

(I only want to do inference; I don't need to fine-tune it.)

I want to use a very large language model (#parameters > 100B) for some experiments. Is it true that the only very large language model we can get access to is the GPT-3 API? Is there any chance of getting access to PaLM or Flan-PaLM 540B at no cost?

I have searched the internet but can't find a definite answer. As GPT-3 pricing for text-davinci-2 is not cheap, I am wondering if there's a chance to use other models.

Also, I can request up to 372 GB of VRAM. Is there any large language model (#parameters > 100B) that I can actually download and run "locally"?
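A rough back-of-the-envelope check for the VRAM question can be sketched as below. It assumes memory is dominated by the weights (parameter count times bytes per parameter) and ignores activations, caches, and framework overhead, so real requirements will be higher. The helper name `weight_memory_gb` and the precision choices are illustrative assumptions, not figures from the post.

```python
def weight_memory_gb(num_params, bytes_per_param):
    """Memory for the weights alone, in GB (taking 1 GB = 1e9 bytes)."""
    return num_params * bytes_per_param / 1e9


VRAM_BUDGET_GB = 372  # the budget quoted in the post

# Parameter counts from the post: the 100B threshold and a 540B model.
for name, params in [("100B model", 100e9), ("540B model", 540e9)]:
    # Assumed storage precisions: 2 bytes (fp16) or 1 byte (int8) per weight.
    for precision, nbytes in [("fp16", 2), ("int8", 1)]:
        need = weight_memory_gb(params, nbytes)
        verdict = "fits" if need <= VRAM_BUDGET_GB else "does not fit"
        print(f"{name} @ {precision}: {need:.0f} GB -> {verdict}")
```

Under these assumptions a model just over 100B parameters fits the budget at 16-bit precision for its weights, while a 540B model does not fit even at 8 bits.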

/r/MachineLearning
https://redd.it/yu8nna

Data Scientology

What do you do while you wait for stuff to run?

I have a few things that take about 10-12 minutes to run and I often find myself distracted on social media. I'd like to be productive during those moments, what do you do?

/r/datascience
https://redd.it/yu4j0v

Data Scientology

Terrorist attacks in Europe that killed at least one person 1970-2015

/r/MapPorn
https://redd.it/yu6j6c

Data Scientology

Ministry of Defence of Ukraine: Losses of the Russian occupiers in Ukraine from Feb. 24–Nov. 13, 2022

/r/Infographics
https://redd.it/yu2enu

Data Scientology

Position of the north magnetic pole since 1590.

/r/MapPorn
https://redd.it/ytypfu

Data Scientology

[Q] [S] Best platform for a personal statistics "blog"?

I'm a student in statistics.

I'd like to set up a kind of "blog" website where I can document certain statistical concepts and demonstrations as I go. Something where I can include TeX and inline code. Maybe even something where I can have some interactive visualizations (à la RShiny).

I'm thinking something along the lines of this website: https://statproofbook.github.io/ but I'd like to include a bit more personal exposition, case studies, interactive visualizations, etc.

Thoughts/suggestions on how to get started on something like this? I don't intend to become a full-fledged web developer or anything. This is more for personal use. I don't necessarily intend for it to be an outward-facing website, just a place where I can organize my own ideas for future reference.

/r/statistics
https://redd.it/ysc1vu

Data Scientology

[Education] Seeking external help for a statistics course

Hi,

I've posted here before, and sorry if this isn't the right place for such a post - please feel free to direct me elsewhere if there are better places to go.

I just started an online master's program in statistics this fall, and I'm REALLY struggling in one class. In fact, I have an A in the class, but as of one or two weeks ago, I stopped understanding what was going on. I don't think the professor or TA can help me sufficiently given HOW lost I feel I am. It is like I hit a wall. I am thus thinking of trying an external resource, like a private tutor.

Can anyone here attest to whether and where there are good private tutors for GRADUATE-level statistics courses? I'd like to find an in-person tutor (Tampa Bay area), though it seems there are more online options. I AM NOT LOOKING FOR SOMEONE TO DO MY HOMEWORK. I am looking for someone who can read over and explain my notes to me AND HELP me with homework, all to the point where I understand it.

Thank you for any suggestions or advice.

Edit: I am open to online tutoring if that's all/the best I can find.

/r/statistics
https://redd.it/ystg13

Data Scientology

[D] When was the last time you wrote a custom neural net?

I work exclusively in NLP, and since transformers, especially pretrained ones, took over, I haven't written a neural net (RNN, LSTM, etc.) in over 3 years and haven't had to worry about things like number of layers, hidden size, etc.

Tabular data has XGBoost, etc.
NLP has Pretrained Transformers.
Images have Pretrained CNNs, Transformers.

But I've been through some ML system design books and recommendation system solutions often display neural nets, so that's interesting.

What was the problem and type of data at hand when you last wrote a neural net yourself, layer by layer?

Thanks y'all!

/r/MachineLearning
https://redd.it/yto34q

Data Scientology

[Q] Independent samples t-tests (IST): why does everyone have a different opinion?

On this page, Student's t-test is just another name for it (no mention of Welch's test).

In my courses there are two types of IST: Student's t-test and Welch's test.

In this video, the IST is not counted among the t-tests at all.

I have lots of tabs open and still can't wrap my head around this test and when to use it.

If someone could help, thanks a lot.
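One way to see how the two tests named above relate is to compute both statistics by hand. The sketch below uses only the Python standard library; the function names `student_t` and `welch_t` and the sample data are made up for illustration. Student's version pools the two sample variances (assuming equal population variances), while Welch's version keeps them separate and adjusts the degrees of freedom with the Welch-Satterthwaite formula.

```python
from math import sqrt
from statistics import mean, variance


def student_t(a, b):
    """Pooled-variance (Student's) t statistic and its degrees of freedom."""
    na, nb = len(a), len(b)
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    t = (mean(a) - mean(b)) / sqrt(pooled * (1 / na + 1 / nb))
    return t, na + nb - 2


def welch_t(a, b):
    """Welch's t statistic with Welch-Satterthwaite degrees of freedom."""
    na, nb = len(a), len(b)
    va, vb = variance(a) / na, variance(b) / nb  # squared standard errors
    t = (mean(a) - mean(b)) / sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (na - 1) + vb ** 2 / (nb - 1))
    return t, df


# Made-up samples with equal sizes but clearly unequal spread.
group_a = [1, 2, 3, 4, 5]
group_b = [2, 4, 6, 8, 10]

print("Student:", student_t(group_a, group_b))
print("Welch:  ", welch_t(group_a, group_b))
```

With equal group sizes the two t statistics coincide and only the degrees of freedom differ; with unequal sizes and unequal variances the statistics themselves diverge, which is the case Welch's test is designed for.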

/r/statistics
https://redd.it/ytgdj0

Data Scientology

[Question] If there are infinitely many universes and I pick one at random, is the probability that there's a me there zero or non-zero?



On the one hand it looks like it should be zero since I'm picking one value, but then I'm not really picking one value, but rather a range so perhaps it should be non-zero?

/r/statistics
https://redd.it/yu3h4n

Data Scientology

World’s Most Surveilled Cities

/r/MapPorn
https://redd.it/yuor6q

Data Scientology

[OC] Map of California Parks

/r/dataisbeautiful
https://redd.it/yul4jz

Data Scientology

Series of charts showing who Americans spend their time with over the course of their lives

https://redd.it/ytr9ax
@datascientology

Data Scientology

[OC] John Wick Series - Doubling the Budget Doubles the Gross

/r/dataisbeautiful
https://redd.it/yuk3lb

Data Scientology

Household Firearm Ownership Rate by U.S. State

/r/MapPorn
https://redd.it/yu26zu

Data Scientology

[OC] I bought and cooked 5.5 lb of chicken quarters. This is the breakdown of the weight throughout the process

/r/dataisbeautiful
https://redd.it/yub6ps

Data Scientology

[D] ML/AI role as a disabled person

I am about to finish my PhD in machine learning soon. Unfortunately, during my PhD, I became disabled and lost most of the function in my hands and some in my legs. I have been relying on voice-to-code software to do my work, but programming with it is not particularly easy or efficient.

I am looking for industry jobs right now, and was hoping to find a research role in ML which didn't involve heavy programming. Is this even possible for someone just entering the job market? I know the job market is quite bad right now, which is complicating matters a lot but I'd really appreciate any ideas for Canada/EU.

/r/MachineLearning
https://redd.it/yu5ch3

Data Scientology

How secure is YOUR phone? (All iPhone users!)

Hello everyone! I am doing a research project and need participants to take our quick 2-3 minute survey regarding their iPhone lockscreen security, and it will ask questions about your Siri and notification settings. Thanks to anyone who takes it!

https://iu.co1.qualtrics.com/jfe/form/SV_0vMDa7eSph9TR1I

/r/SampleSize
https://redd.it/yts9mu

Data Scientology

Qatar has the world's highest gender ratio with 300 males per 100 females.
https://en.m.wikipedia.org/wiki/File:Qatar_single_age_population_pyramid_2020.png

/r/dataisbeautiful
https://redd.it/ytz8c6

Data Scientology

[E] Master’s in Applied Statistics: recommended classes to take beforehand (finance major)

Obviously I will be taking Calc 1-3 and matrix math, but what other courses should I take to prepare for this program?

/r/statistics
https://redd.it/ysdii0

Data Scientology

UK bus timetable route dataset needed

Does anyone know if UK bus route timetable data is publicly available?

I need a new side project and thought of doing a geospatial analysis of urban routes 5 years ago vs. now.

I have a feeling this isn’t easily found due to the fragmentation of bus operators outside of London.

Ta

/r/datasets
https://redd.it/ytbaut

Data Scientology

Grocery Product Dataset for USA and Australia

Hi guys, I saw some folks looking for this type of dataset, so here is a grocery product dataset for the US and Australia. Please check it out; hopefully it helps!

Link to Dataset

/r/datasets
https://redd.it/ys4944

Data Scientology

A Not Shockingly Unrealistic U.S. High Speed Rail Plan

/r/MapPorn
https://redd.it/yt8k7d

Data Scientology

Comparison of annual births between Japan and South Korea, a race to the bottom [OC]

/r/dataisbeautiful
https://redd.it/ytdgq0
