Sat. Oct 07, 2023 - Fri. Oct 13, 2023

7 High Paying Side Hustles for Data Scientists

KDnuggets

This article serves as a guide for the data professional who wants to earn more in these trying times.

The insideBIGDATA IMPACT 50 List for Q4 2023

insideBIGDATA

The team here at insideBIGDATA keeps a close watch on the pulse of the big data ecosystem of companies from around the globe. We’re in close contact with the movers and shakers making waves in the technology areas of big data, data science, machine learning, AI and deep learning. Our inbox is filled each day with new announcements, commentaries, and insights about what’s driving the success of our industry, so we’re in a unique position to publish our quarterly IMPACT 50 List.

Big Data 474

LLM Inference Performance Engineering: Best Practices

databricks

In this blog post, the MosaicML engineering team shares best practices for how to capitalize on popular open source large language models (LLMs).


Exploring the Advanced Multi-Modal Generative AI

Analytics Vidhya

In today’s ever-advancing world of technology, there’s an exciting development on the horizon: advanced multi-modal generative AI. This technology aims to make computers better at both creating and understanding content. Imagine a digital assistant that works seamlessly with text, images, and sounds and generates information.

AI 353

What’s New in Apache Airflow® 3.0—And How Will It Reshape Your Data Workflows?

Speaker: Tamara Fingerlin, Developer Advocate

Apache Airflow® 3.0, the most anticipated Airflow release yet, officially launched this April. As the de facto standard for data orchestration, Airflow is trusted by over 77,000 organizations to power everything from advanced analytics to production AI and MLOps. With the 3.0 release, the top-requested features from the community were delivered, including a revamped UI for easier navigation, stronger security, and greater flexibility to run tasks anywhere at any time.

Unlocking GPT-4 Summarization with Chain of Density Prompting

KDnuggets

Unlock the power of GPT-4 summarization with Chain of Density (CoD), a technique that attempts to balance information density for high-quality summaries.

Keeping a Level Head during AI Implementation

insideBIGDATA

In this contributed article, Frank Laura, Chief Technology Officer at EngageSmart (NYSE: ESMT), discusses why CIOs and CTOs need to bring AI into businesses safely, securely, and legally. AI will enable CIOs and their teams to shift focus away from tactical and/or repetitive work towards creating innovative solutions for their teams and customers.

AI 435

More Trending

How to Build LLM Apps Using Vector Database?

Analytics Vidhya

In the field of artificial intelligence, Large Language Models (LLMs) and Generative AI models such as OpenAI’s GPT-4, Anthropic’s Claude 2, Meta’s Llama, Falcon, and Google’s PaLM have revolutionized the way we solve problems. LLMs use deep learning techniques to perform natural language processing tasks. This article will teach you to build LLM Apps […]

Database 353

Rust Burn Library for Deep Learning

KDnuggets

A new deep learning framework built entirely in Rust that aims to balance flexibility, performance, and ease of use for researchers, ML engineers, and developers.

Microsoft’s Quest for the Next Killer App

insideBIGDATA

In this contributed article, Gordon McKenna, VP of Cloud Evangelist & Alliances at Ensono, discusses Microsoft’s strong backing of OpenAI and what we can expect the future to look like. Microsoft’s investment in OpenAI was a clear move for the company to align itself with the next killer app that would drive engagement on Azure cloud.

Azure 392

Llama 2 Foundation Models Available in Databricks Lakehouse AI

databricks

We’re excited to announce that Meta AI’s Llama 2 foundation chat models are available in the Databricks Marketplace for you to fine-tune and dep.

AI 334

Agent Tooling: Connecting AI to Your Tools, Systems & Data

Speaker: Alex Salazar, CEO & Co-Founder @ Arcade | Nate Barbettini, Founding Engineer @ Arcade | Tony Karrer, Founder & CTO @ Aggregage

There’s a lot of noise surrounding the ability of AI agents to connect to your tools, systems and data. But building an AI application into a reliable, secure workflow agent isn’t as simple as plugging in an API. As an engineering leader, it can be challenging to make sense of this evolving landscape, but agent tooling provides such high value that it’s critical we figure out how to move forward.

Leading With Data: Building a Data Driven Organization with Srikanth Velamakanni

Analytics Vidhya

Analytics Vidhya’s ‘Leading With Data’ is a series of interviews where industry leaders share their experiences, career journeys, interesting projects, and more. In the 5th episode of the series, we are joined by a very special guest: Mr. Srikanth Velamakanni. He is the Group CEO, Co-founder, and Vice-Chairman of Fractal Analytics, one of the […]

Analytics 343

Best Practices for Building ETLs for ML

KDnuggets

This article talks about several best practices for writing ETLs for building training datasets. It delves into several software engineering techniques and patterns applied to ML.

ETL 371

How Can Platform Engineers Thrive in the Age of AI?

insideBIGDATA

In this contributed article, lead systems and DevOps engineer Manish Sharma discusses how platform engineering is a constantly developing field, and the advent of AI will likely accelerate the pace of change. Engineers can prepare for and adapt to the coming shifts by keeping up to date with developments in AI, including the increasing number of available AI tools and their applications for platform engineering.

AI 370

Edge computing in 2023: International data science trends

Data Science Dojo

In today’s world, technology is evolving at a rapid pace. One of the advanced developments is edge computing. But what exactly is it? And why is it becoming so important? This article will explore edge computing and why it is considered the new frontier in international data science trends.

Understanding edge computing

Edge computing is a method where data processing happens closer to where it is generated rather than relying on a centralized data-processing warehouse.

How to Modernize Manufacturing Without Losing Control

Speaker: Andrew Skoog, Founder of MachinistX & President of Hexis Representatives

Manufacturing is evolving, and the right technology can empower—not replace—your workforce. Smart automation and AI-driven software are revolutionizing decision-making, optimizing processes, and improving efficiency. But how do you implement these tools with confidence and ensure they complement human expertise rather than override it? Join industry expert Andrew Skoog as he explores how manufacturers can leverage automation to enhance operations, streamline workflows, and make smarter, data-dri

Unlocking Creativity with Advanced Transformers in Generative AI

Analytics Vidhya

In the ever-evolving landscape of artificial intelligence, one name has stood out prominently in recent years: transformers. These powerful models have transformed the way we approach generative tasks in AI, pushing the boundaries of what machines can create and imagine. In this article, we will delve into the advanced applications of transformers in generative […]

AI and Open Source Software: Separated at Birth?

KDnuggets

In this article, Luis shares with readers his thoughts on the intersection of open source software and machine learning and what the future might bring. Many articles cover how open source software is used by the machine learning community but this post focuses on the similarities between the two areas of practice and what machine learning can and can’t learn from open source software.

Announcing public preview of Databricks Asset Bundles: Apply software development best practices with ease

databricks

We are delighted to announce that Databricks Asset Bundles are now in public preview. Bundles, for short, facilitate the adoption of software engineering.


Data erasure – Data protection in the modern digital landscape

Data Science Dojo

Data erasure is a software-based process of data sanitization, or in plain words ‘data wiping’, that leaves no recoverable traces of data. This helps prevent data leakage and protects sensitive information like trade secrets, intellectual property, or customer information. By 2025, it is estimated that data will grow to 175 zettabytes, and with great data comes great responsibility.

Database 273

Automation, Evolved: Your New Playbook for Smarter Knowledge Work

Speaker: Frank Taliano

Documents are the backbone of enterprise operations, but they are also a common source of inefficiency. From buried insights to manual handoffs, document-based workflows can quietly stall decision-making and drain resources. For large, complex organizations, legacy systems and siloed processes create friction that AI is uniquely positioned to resolve.

OpenAI’s GPT-4V(ision): A Breakthrough in AI’s Multimodal Frontier

Analytics Vidhya

In a groundbreaking move reshaping the landscape of artificial intelligence, OpenAI has unveiled GPT-4 with vision, aptly named GPT-4V. This new iteration empowers users to harness the combined might of language and visual data, unlocking unprecedented capabilities that promise to revolutionize our interactions with AI. Here, we delve into this latest advancement and explore […]

Comparing Natural Language Processing Techniques: RNNs, Transformers, BERT

KDnuggets

RNN, Transformers, and BERT are popular NLP techniques with tradeoffs in sequence modeling, parallelization, and pre-training for downstream tasks.

Databricks Obtains ISO 27701 Certification

databricks

We’re excited to announce that Databricks has obtained the International Organization for Standardization (ISO) 27701 certification as a data processor. This certification reflects our c.


Reinforcement Learning: Balancing Exploration and Exploitation

insideBIGDATA

In this contributed article, Anthony Chong, CEO/Co-Founder of IKASI, discusses the three types of machine learning approaches, the benefits and requirements of each, and offers examples of how organizations are applying these tactics to address real-world business challenges.

The 2nd Generation of Innovation Management: A Survival Guide

Speaker: Chris Townsend, VP of Product Marketing, Wellspring

Over the past decade, companies have embraced innovation with enthusiasm—Chief Innovation Officers have been hired, and in-house incubators, accelerators, and co-creation labs have been launched. CEOs have spoken with passion about “making everyone an innovator” and the need “to disrupt our own business.” But after years of experimentation, senior leaders are asking: Is this still just an experiment, or are we in it for the long haul?

A MLOps-Enhanced Customer Churn Prediction Project

Analytics Vidhya

When we hear data science, the first thing that comes to mind is building a model in notebooks and training it on data. But this is not the situation in real-world data science. In the real world, data scientists build models and put them into production. The production environment has a gap between the development, […]

Exploring Data Mesh: A Paradigm Shift in Data Architecture

KDnuggets

Let’s explore Data Mesh, a modern approach to data architecture that decentralizes data ownership and management.

Scalable, In-House Quality Measurement with a NCQA-Certified Engine on the Lakehouse

databricks

This blog was written in collaboration with David Roberts (Analytics Engineering Manager), Kevin P. Buchan Jr (Assistant Vice President, Analytics), and Yubin Park.

Analytics 264

AI Engineer Summit - Building Blocks for LLM Systems & Products

Eugene Yan

I give one talk a year and in 2023 this is that talk.

AI 264

How to Achieve High-Accuracy Results When Using LLMs

Speaker: Ben Epstein, Stealth Founder & CTO | Tony Karrer, Founder & CTO, Aggregage

When tasked with building a fundamentally new product line with deeper insights than previously achievable for a high-value client, Ben Epstein and his team faced a significant challenge: how to harness LLMs to produce consistent, high-accuracy outputs at scale. In this new session, Ben will share how he and his team engineered a system (based on proven software engineering approaches) that employs reproducible test variations (via temperature 0 and fixed seeds), and enables non-LLM evaluation m

How to Learn Machine Learning Online?

Analytics Vidhya

Machine learning is a rapidly developing domain of technology at present. This technology allows computer systems to learn and make decisions without explicit programming. It has a variety of applications, including recognizing patterns, analyzing data, and improving performance over time. This guide on how to learn machine learning online will introduce you to the […]

Revamping Data Visualization: Mastering Time-Based Resampling in Pandas

KDnuggets

Unlock the power of time-based data visualization with Pandas as we delve into the art of resampling, turning your data into insightful temporal masterpieces.

Databricks and Shell collaborate to simplify industrial time series data analytics on the Lakehouse

databricks

Written in partnership with Shell. The energy industry is all about physical assets – from terminals, ships and pipelines to refineries and wind f.

Analytics 264

Video Highlights: Make Better Decisions with Data — with Dr. Allen Downey

insideBIGDATA

In this video presentation, our good friend Jon Krohn, Co-Founder and Chief Data Scientist at the machine learning company Nebula, is joined by Dr. Allen Downey, renowned author and professor, who shares insights from his upcoming book 'Probably Overthinking It,' breaking down underused techniques like Survival Analysis, explaining common paradoxes, discussing the dynamic Overton Window, and how to be prepared for Black Swan events.

Apache Airflow® Best Practices: DAG Writing

Speaker: Tamara Fingerlin, Developer Advocate

In this new webinar, Tamara Fingerlin, Developer Advocate, will walk you through many Airflow best practices and advanced features that can help you make your pipelines more manageable, adaptive, and robust. She'll focus on how to write best-in-class Airflow DAGs using the latest Airflow features like dynamic task mapping and data-driven scheduling!