datascientology | Education

Telegram channel datascientology - Data Scientology

1234

Hot data science related posts every hour. Chat: https://telegram.me/r_channels Contacts: @lgyanf

Data Scientology

Pointers to write "clean" code for production?

I came across a recent post highlighting that one of the main reasons others hate DS people is the crappy code they hand over to production, and it got me wondering:

What is "clean" or "good" code that you expect a good DS professional to hand you for production so that you don't hate him/her for it? It'd be good to get some pointers so I can avoid these mistakes in the future and not be hated for them.

And please don't reply "depends on the situation"; please give good generalisable advice.

/r/datascience
https://redd.it/yphayx

Data Scientology

[OC] Hong Kong has the highest proportion of female prisoners

/r/dataisbeautiful
https://redd.it/yphout

Data Scientology

Automating Apache Iceberg table maintenance on AWS
https://www.matano.dev/blog/2022/11/04/automated-iceberg-table-maintenance

/r/bigdata
https://redd.it/ym38tz

Data Scientology

[OC] Election Update: 2022 Mid-Term Ballots already cast by Seniors 65+ outweighs Young Voters (18-29) by 7 to 1. 95% of People Under 30 have not voted yet.

/r/dataisbeautiful
https://redd.it/yp5xqq

Data Scientology

Data Scientist / ML am I burning out?

Hi all,
this is a bit atypical in this sub, but I am really wondering how people are dealing with it. I started getting into machine learning because I was absolutely fascinated by some of its applications: prediction of stuff, image recognition, self driving, image generation... I mean there are tons of applications out there.

I managed to land a job where my time is split between building models for marketing, like sales lead and churn models. After a few years I feel like my curiosity has been going down more and more.
I still enjoy coding, but I am not really excited anymore about the problem at hand. It is always more of the same in slightly different clothes.
I realized that there is little that cannot be done with just XGBoost and some common sense when defining your dataset. If that doesn't work, it's probably not worth my time anyway, and it's time to move on and find another problem or another angle.
My main issue is that I don't feel like I am on autopilot either. Each dataset has its own peculiarities and you still need brain power to understand how the data is generated, what the outliers are, why there are outliers, and the 1000 little things that can go wrong with your assumptions/code.

Should I start reading more papers? Do more toy projects? Go on a vacation? Close reddit for a bit?

/r/datascience
https://redd.it/yol32e

Data Scientology

[OC] The eurodollar futures curve, which is pointing to a US recession next year

/r/dataisbeautiful
https://redd.it/yoht4k

Data Scientology

[OC] First and last frost dates for Concord, NH for 80 years. The growing season is getting longer. Repost because the image got deleted somehow.

/r/dataisbeautiful
https://redd.it/yp4sra

Data Scientology

Poker Hand Rankings.

/r/Infographics
https://redd.it/yoo4d2

Data Scientology

[OC] Origins of the Brazilian presidents with recent and/or easily trackable ancestry

/r/MapPorn
https://redd.it/yok6gm

Data Scientology

Did you know there is a part of Italy completely surrounded by Switzerland?

/r/MapPorn
https://redd.it/yojfcl

Data Scientology

Young voter turnout has never crossed 50% in US elections since 1968.
https://www.statista.com/statistics/1096299/voter-turnout-presidential-elections-by-age-historical/

/r/dataisbeautiful
https://redd.it/yoez1j

Data Scientology

The typical sleeping patterns of 40 different animals, highlighting their average sleep times and what percentage of each 24-hour day they spend resting

/r/Infographics
https://redd.it/yo2qmg

Data Scientology

Over the road truck driving since 2016 via the Fog Of World app. [OC]

/r/dataisbeautiful
https://redd.it/yo97x9

Data Scientology

Energy Poverty Indicators Database construction

Hello! Could you please let me know which indicators you think would be useful to consider when characterising energy poverty?

I am currently trying to develop a robust dataset of indicators which can be used to analyse economic poverty, energy consumption, building characteristics, health issues and other social consequences of energy poverty.

Let me know all your suggestions on what you think should be considered: specific indicators, datasets, websites or projects which you think would be interesting for this topic.

/r/datasets
https://redd.it/yo6i21

Data Scientology

Data Science Hierarchy of Needs ... as relevant as ever

/r/datascience
https://redd.it/ynx8o8

Data Scientology

hot take: forget data science, we need more analysts

People are obsessed with pursuing data science roles for some reason. I guess it's interesting work with a high skill ceiling. That's why I'm pursuing it. But nobody talks about the data analyst: the folks who write SQL for reporting, create dashboards, and provide insights. Data science does do all this in a more sophisticated way, but the reality is that most tech companies or startups do not even have an appetite for that kind of work, since they are so focused on growth. If you're struggling to get into data science, consider analytics. The pay is still good (100k plus if you're doing product analytics) and a natural growth path from there can totally be data science. Don't rule it out, you have options. End 😊

/r/datascience
https://redd.it/ypr93q

Data Scientology

Pytorch Symbolic: an equivalent of Keras Functional API [Project]

Hello!

I just hit version 1.0.0 of a library I've been developing over the past months as a side project.

# Pytorch Symbolic

A library that aims to provide a concise API for neural network creation in PyTorch. The API and the inner workings are similar to [Keras / TensorFlow2 Functional API](https://www.tensorflow.org/guide/keras/functional) which I always enjoyed using. I decided to go with "Symbolic" in the name instead of "Functional" because I believe it better represents how the library works (also "functional" is kind of taken by `torch.nn.functional`).

I did my best to prepare useful documentation, so if you are interested, please check it out!
It is filled with examples, best practices, benchmarks, explanations of the inner workings and more.

* [**See Documentation**](https://pytorch-symbolic.readthedocs.io/en/latest/quick_start)
* [See on GitHub](https://github.com/gahaalt/pytorch-symbolic/)
* [See on PyPI](https://pypi.org/project/pytorch-symbolic/)

## Example

This example shows how to create a neural network with multiple inputs:
```
from torch import nn
from pytorch_symbolic import Input, SymbolicModel

input1 = Input(shape=(3, 32, 32))
input2 = Input(shape=(3, 32, 32))

output1 = nn.Conv2d(input1.C, 16, 3)(input1)
output2 = nn.Conv2d(input2.C, 16, 3)(input2)
final_output = output1 + output2

model = SymbolicModel(inputs=(input1, input2), outputs=final_output)
model.summary()
```
A Keras-like summary is available as well:
```
___________________________________________________________
     Layer          Output shape         Params   Parent
===========================================================
1    Input_1        (None, 3, 32, 32)    0
2    Input_2        (None, 3, 32, 32)    0
3    Conv2d_1       (None, 16, 30, 30)   448      1
4    Conv2d_2       (None, 16, 30, 30)   448      2
5*   AddOpLayer_1   (None, 16, 30, 30)   0        3,4
===========================================================
Total params: 896
Trainable params: 896
Non-trainable params: 0
___________________________________________________________
```
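
The shapes and parameter counts in the summary can be checked by hand. Below is a minimal sketch in plain Python, assuming the default `nn.Conv2d` settings used in the example (stride 1, no padding, no dilation, bias enabled). The helper names `conv2d_output_size` and `conv2d_param_count` are hypothetical and are not part of PyTorch or Pytorch Symbolic.

```python
# Hypothetical helpers for illustration only; not part of any library.

def conv2d_output_size(size, kernel_size, stride=1, padding=0, dilation=1):
    """Spatial output size of a 2D convolution along one dimension."""
    return (size + 2 * padding - dilation * (kernel_size - 1) - 1) // stride + 1


def conv2d_param_count(in_channels, out_channels, kernel_size, bias=True):
    """Number of learnable parameters of a square-kernel 2D convolution."""
    weights = in_channels * out_channels * kernel_size * kernel_size
    return weights + (out_channels if bias else 0)


# Values taken from the example: 3x32x32 inputs, 16 output channels, 3x3 kernel.
side = conv2d_output_size(32, kernel_size=3)
params_per_conv = conv2d_param_count(3, 16, kernel_size=3)
total_params = 2 * params_per_conv  # two Conv2d layers; the add layer has no parameters

print(side, params_per_conv, total_params)
```

Under these assumptions the numbers agree with the summary: a 30x30 spatial output, 448 parameters per convolution, and 896 in total.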

Pytorch Symbolic is actually more powerful than Keras Functional API, because it can be used to define graphs of operations over arbitrary Python objects. So if your `torch.nn.Module` operates on dictionaries, you can still use it in Pytorch Symbolic. This is not just a gimmick, as it is necessary to be compatible with all the Modules provided by `torch.nn` and Modules commonly used by the community.
You can read more in [Advanced Topics](https://pytorch-symbolic.readthedocs.io/en/latest/advanced_topics/) section of the documentation.

## Installation
Installation is easy with pip and there are no dependencies besides PyTorch (`torch>=1.12.0`):
```
pip install pytorch-symbolic
```

It is a small package, easy to install and uninstall, if you don't like it. :)

There's an introduction in the form of a Jupyter Notebook, if you prefer it.
[Go to GitHub](https://github.com/gahaalt/pytorch-symbolic/) to see it or run in Colab.

---

This library does not compete with the existing:
* FastAI
* PyTorch Lightning
* PyTorch Ignite

All of these are great libraries for data processing and neural network training. Pytorch Symbolic provides an API solely for neural network creation and produces models entirely compatible with PyTorch, which means you can train the models created with Pytorch Symbolic using one of the above libraries!

Whether you try it or not, I am excited to hear your feedback on this library. Do you have any suggestions, questions, critique? Please share all your thoughts! :)

Contributions are welcome too!

/r/MachineLearning
https://redd.it/ypkfwq

Data Scientology

[OC] A Comparison of Energy Sources according to Pollution, Death Rate, Price and Land Use

/r/dataisbeautiful
https://redd.it/yphovx

Data Scientology

[OC] Housing Cost Increase by State (1975 - 2022)

/r/dataisbeautiful
https://redd.it/yp9flt

Data Scientology

Seems a bit crazy, 400 applications within 3 days! Does this put anyone else off applying?

/r/datascience
https://redd.it/yp082p

Data Scientology

For the outsiders Yankee = American. In America it varies.

/r/MapPorn
https://redd.it/yp1g09

Data Scientology

Where the death penalty survives around the world.

/r/MapPorn
https://redd.it/yopymu

Data Scientology

India's Top export destination per continent [OC]

/r/dataisbeautiful
https://redd.it/yol0en

Data Scientology

Message frequency by daytime for some of my chats [OC]

/r/dataisbeautiful
https://redd.it/yoqe5l

Data Scientology

Repost The economic effects of the Brexit (UK citizens)

Hi! For my final high school paper, I am researching the effects of Brexit on the UK, and I need your help!

To get a more complete picture of the effects of Brexit, we would like to ask you to fill in the survey below. The survey is completely anonymous and takes around 5 minutes to fill in. Your answers will be deleted once our paper is finished.

The survey: https://forms.gle/9pYy3eB7bim7n2T36

/r/SampleSize
https://redd.it/yohsr7

Data Scientology

Looking for data on houseplant waste in supply chains

Does anyone know of a dataset on how many houseplants die in the supply chain, either in nurseries, in transit, or in plant stores? I found this report on Business Insider but no solid data anywhere. I'm mostly interested in data for the European market, but any data is useful. **https://www.businessinsider.com/houseplant-industry-americans-billions-die-2022-3?utm_campaign=sf-bi-main&utm_source=facebook.com&utm_medium=social&fbclid=IwAR2tEdXO5y5ai2gw5Vu5aVB2x0kWs0OPmHg_8Sljjhcc9dtQpSmpxp9bMGc**

/r/datasets
https://redd.it/ynqmnc

Data Scientology

If you're in the fortunate position to be picky about your next career move, please push back against the many bad DS recruitment practices. Don't hold back.

If you're told that the process will involve an unreasonably large number of interviews, tell them no.

If you're asked to do a 10 hour take-home assignment, tell them no.

If you're asked to do some brain-teaser questions and/or probability-esque calculations in a live setting, tell them no.

If they ghosted you for 4 weeks and then all of a sudden pretend to be interested in your candidacy, tell them no.

If they refuse to be upfront about salary, unwilling to provide even a reasonably sized range, tell them no.

I completely realize not everyone is in the lucky position to be picky. But if you are, use that to send a signal to recruiters that the practices they're using are very often completely ridiculous.

/r/datascience
https://redd.it/ynwpg0

Data Scientology

[P] Transcribe any podcast episode in just 1 minute with optimized OpenAI/whisper

/r/MachineLearning
https://redd.it/ynz4m1

Data Scientology

Highest-grossing IPs of all time

https://redd.it/yo9txr
@datascientology

Data Scientology

Every known Quantum Particle Poster

/r/Infographics
https://redd.it/ynyaqg
