
Ask HN: Who wants to be hired? (July 2025)

Hacker News

I'm JD, a software engineer with experience across much of the stack (frontend, backend, databases, data and ETL pipelines, you name it). I have about 3 years of experience training PyTorch models on HPC clusters and 1 year optimizing PyTorch models, including with custom CUDA kernels. Email: hoglan (dot) jd (at) gmail


What is the Snowflake Data Cloud and How Much Does it Cost?

phData

If you go back to 2014, data warehouse platforms were built on legacy architectures that had drawbacks in cost, scale, and flexibility. Data processing: Snowflake can process large datasets and perform data transformations, making it suitable for ETL (Extract, Transform, Load) processes.
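The ETL pattern the snippet mentions can be sketched with the Python standard library alone. This is a generic illustration, not Snowflake-specific (Snowflake itself is reached through its own connector); the CSV data, table name, and column names here are invented for the example, with SQLite standing in for the warehouse:

```python
import csv
import io
import sqlite3

# Hypothetical raw CSV export (invented for this sketch).
RAW = """order_id,amount,currency
1,10.50,usd
2,7.25,usd
"""

def extract(text):
    # Extract: parse rows out of the raw CSV source.
    return list(csv.DictReader(io.StringIO(text)))

def transform(rows):
    # Transform: cast types and normalize currency codes.
    return [(int(r["order_id"]), float(r["amount"]), r["currency"].upper())
            for r in rows]

def load(rows, conn):
    # Load: write the cleaned rows into a warehouse-style table.
    conn.execute("CREATE TABLE orders (order_id INT, amount REAL, currency TEXT)")
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", rows)

conn = sqlite3.connect(":memory:")  # stand-in for the warehouse connection
load(transform(extract(RAW)), conn)
print(conn.execute("SELECT COUNT(*), SUM(amount) FROM orders").fetchone())
# → (2, 17.75)
```

The same three-stage shape applies when the load target is a real warehouse; only the connection and SQL dialect change.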



7 Best Machine Learning Workflow and Pipeline Orchestration Tools 2024

DagsHub

The project was created in 2014 by Airbnb and has been developed by the Apache Software Foundation since 2016. Flexibility: its use cases extend beyond machine learning; for example, it can be used to set up ETL pipelines. It is cloud-agnostic and can run on any Kubernetes cluster.


How to Manage Unstructured Data in AI and Machine Learning Projects

DagsHub

Apache Hadoop is an open-source framework that supports the distributed processing of large datasets across clusters of computers. … is similar to the traditional Extract, Transform, Load (ETL) process. Kafka is highly scalable and ideal for high-throughput, low-latency data pipeline applications. Unstructured.io
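The producer/consumer decoupling that makes Kafka suit high-throughput pipelines can be sketched in-process with the standard library. This is only an analogy, not Kafka itself (real Kafka is accessed through client libraries over the network); a thread-safe queue stands in for a topic here, and the message shape is invented:

```python
import queue
import threading

# Stand-in "topic": Kafka decouples producers from consumers via an
# append-only log; a thread-safe FIFO queue gives the same decoupling
# in-process for illustration.
topic = queue.Queue()
results = []

def producer():
    for i in range(5):
        topic.put({"event_id": i})  # analogous to sending a message to a topic
    topic.put(None)                 # sentinel: signal end of stream

def consumer():
    while True:
        msg = topic.get()           # analogous to polling the topic
        if msg is None:
            break
        results.append(msg["event_id"])

t1 = threading.Thread(target=producer)
t2 = threading.Thread(target=consumer)
t1.start(); t2.start()
t1.join(); t2.join()
print(results)  # → [0, 1, 2, 3, 4]
```

In Kafka the queue is durable, partitioned, and shared across machines, which is what turns this pattern into a scalable, low-latency data pipeline.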