Big Data – Lambda or Kappa Architecture?

Data Science Blog

Kappa Architecture: Jay Kreps introduced the Kappa architecture in 2014 as an alternative to the Lambda architecture. For existing event sources, listeners stream writes directly from database logs or similar data stores. It offers the advantage of a single ETL platform to develop and maintain.

Big Data 130
The Full Stack Data Scientist Part 6: Automation with Airflow

Applied Data Science

To keep myself sane, I use Airflow to automate tasks with simple, reusable pieces of code for frequently repeated elements of projects, for example: web scraping, ETL, database management, feature building and data validation, and much more! What’s Airflow, and why’s it so good? What makes it my go-to?

How to Manage Unstructured Data in AI and Machine Learning Projects

DagsHub

Data can come from different sources, such as databases or directly from users, as well as platforms like GitHub, Notion, or S3 buckets. Vector Databases: Vector databases help store unstructured data by storing the actual data alongside its vector representation. mp4, webm, etc.), and audio files (.wav, mp3, acc,

What is the Snowflake Data Cloud and How Much Does it Cost?

phData

As an example, an IT team could easily take the knowledge of database deployment from on-premises and deploy the same solution in the cloud on an always-running virtual machine. If you go back to 2014, data warehouse platforms were built using legacy architectures that had drawbacks when it came to cost, scale, and flexibility.

How to Use Exploratory Notebooks [Best Practices]

The MLOps Blog

In 2014, Project Jupyter evolved from IPython. Reading & executing from .sql scripts: We can use .sql files that are opened and executed from the notebook through a database connector library. connection_params: A dictionary containing PostgreSQL connection parameters, such as 'host', 'port', 'database', 'user', and 'password'.
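
The idea of opening a .sql file and executing it through a connector can be sketched roughly as follows. This is only an illustration under stated assumptions: it uses Python's built-in sqlite3 module as a stand-in for a PostgreSQL connector so that it runs without a server, and the helper name run_sql_file, the file setup.sql, and the events table are all hypothetical, not taken from the article.

```python
import sqlite3
import tempfile
from pathlib import Path


def run_sql_file(connection, sql_path):
    """Read a .sql file from disk and execute every statement in it.

    Hypothetical helper for illustration; the original article's
    function is not shown in the excerpt.
    """
    script = Path(sql_path).read_text(encoding="utf-8")
    connection.executescript(script)  # sqlite3-specific: runs multiple statements
    connection.commit()


# Assumption: with SQLite only a database name is needed. A PostgreSQL
# connector would also take 'host', 'port', 'user', and 'password'.
connection_params = {"database": ":memory:"}

with tempfile.TemporaryDirectory() as tmp:
    # Write a small example script so the sketch is self-contained.
    sql_file = Path(tmp) / "setup.sql"
    sql_file.write_text(
        "CREATE TABLE events (id INTEGER PRIMARY KEY, name TEXT);\n"
        "INSERT INTO events (name) VALUES ('load'), ('transform');\n",
        encoding="utf-8",
    )

    conn = sqlite3.connect(connection_params["database"])
    run_sql_file(conn, sql_file)
    rows = conn.execute("SELECT name FROM events ORDER BY id").fetchall()
    conn.close()

print(rows)
```

Keeping the SQL in its own file means the same script can be reviewed, versioned, and run outside the notebook; with a real PostgreSQL connector the connect call and the multi-statement execution would differ.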

SQL 52
Ask HN: Who wants to be hired? (July 2025)

Hacker News

I'm JD, a Software Engineer with experience touching many parts of the stack (frontend, backend, databases, data & ETL pipelines, you name it). With over 3 years of working with ETL pipelines and REST API integrations and development, I understand how to develop and maintain robust and scalable data systems. Let’s talk.

Python 65