
Ask HN: Who wants to be hired? (July 2025)

Hacker News

I'm JD, a software engineer with experience across much of the stack (frontend, backend, databases, data and ETL pipelines, you name it). I have about 3 years of experience training PyTorch models on HPC clusters and 1 year optimizing PyTorch models, including with custom CUDA kernels. Email: hoglan (dot) jd (at) gmail


What is the Snowflake Data Cloud and How Much Does it Cost?

phData

If you go back to 2014, data warehouse platforms were built on legacy architectures that had drawbacks in cost, scale, and flexibility. Data processing: Snowflake can process large datasets and perform data transformations, making it suitable for ETL (Extract, Transform, Load) processes.
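The ETL pattern the snippet mentions can be sketched with the Python standard library alone. This is a generic illustration, not Snowflake-specific (Snowflake itself is reached through its own connector); the CSV data, table name, and column names here are invented for the example, with SQLite standing in for the warehouse:

```python
import csv
import io
import sqlite3

# Hypothetical raw CSV export (invented for this sketch).
RAW = """order_id,amount,currency
1,10.50,usd
2,7.25,usd
"""

def extract(text):
    # Extract: parse rows out of the raw CSV source.
    return list(csv.DictReader(io.StringIO(text)))

def transform(rows):
    # Transform: cast types and normalize currency codes.
    return [(int(r["order_id"]), float(r["amount"]), r["currency"].upper())
            for r in rows]

def load(rows, conn):
    # Load: write the cleaned rows into a warehouse-style table.
    conn.execute("CREATE TABLE orders (order_id INT, amount REAL, currency TEXT)")
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", rows)

conn = sqlite3.connect(":memory:")  # stand-in for the warehouse connection
load(transform(extract(RAW)), conn)
print(conn.execute("SELECT COUNT(*), SUM(amount) FROM orders").fetchone())
# → (2, 17.75)
```

The same three-stage shape applies when the load target is a real warehouse; only the connection and SQL dialect change.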



7 Best Machine Learning Workflow and Pipeline Orchestration Tools 2024

DagsHub

The project was created in 2014 by Airbnb and has been developed by the Apache Software Foundation since 2016. Flexibility: its use cases extend beyond machine learning; for example, it can be used to set up ETL pipelines. It is cloud-agnostic and can run on any Kubernetes cluster.


How to Manage Unstructured Data in AI and Machine Learning Projects

DagsHub

Apache Hadoop is an open-source framework that supports the distributed processing of large datasets across clusters of computers. … is similar to the traditional Extract, Transform, Load (ETL) process. Kafka is highly scalable and ideal for high-throughput, low-latency data pipeline applications. Unstructured.io
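The producer/consumer decoupling that makes Kafka suit high-throughput pipelines can be sketched in-process with the standard library. This is only an analogy, not Kafka itself (real Kafka is accessed through client libraries over the network); a thread-safe queue stands in for a topic here, and the message shape is invented:

```python
import queue
import threading

# Stand-in "topic": Kafka decouples producers from consumers via an
# append-only log; a thread-safe FIFO queue gives the same decoupling
# in-process for illustration.
topic = queue.Queue()
results = []

def producer():
    for i in range(5):
        topic.put({"event_id": i})  # analogous to sending a message to a topic
    topic.put(None)                 # sentinel: signal end of stream

def consumer():
    while True:
        msg = topic.get()           # analogous to polling the topic
        if msg is None:
            break
        results.append(msg["event_id"])

t1 = threading.Thread(target=producer)
t2 = threading.Thread(target=consumer)
t1.start(); t2.start()
t1.join(); t2.join()
print(results)  # → [0, 1, 2, 3, 4]
```

In Kafka the queue is durable, partitioned, and shared across machines, which is what turns this pattern into a scalable, low-latency data pipeline.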