One platform to unite them all
Discussion on work-life balance and preventing burnout in the data field.
Chief Data Officer role-playing game.
General availability of dbt Projects on Snowflake, integration with Snowsight and CLI, performance improvements and extended support for dbt commands.
Faire moved its data analytics into an IDE with Cursor, improving SQL workflows and report automation. The team uses Snowflake and built MCP (Model Context Protocol) servers to connect its tools.
Demonstrates using FalkorDB and Neo4j to create a dbt knowledge graph and interact with it via AI-powered chat, optimizing complex dependency management.
DuckDB provides a serverless solution for SQL operations, integrating with pandas and Polars, optimizing memory and performance for data scientists.
Snowflake introduces interactive tables and warehouses in preview on AWS, enhancing query performance and real-time data processing capabilities.
Snowflake announces general availability of sharing semantic views, enabling providers to share them in private, public, and organizational listings.
Snowflake introduces pg_lake to integrate data lakehouses with Postgres, enabling federated SQL queries.
Deepnote is open-sourcing its successor to Jupyter notebooks, offering reactive, collaborative, and AI-ready features with a human- and AI-readable project format.
ClickHouse announces the acquisition of LibreChat, an open-source AI chat platform, enhancing agentic analytics capabilities with large-scale data processing.
Snowflake Intelligence leverages real-time text-to-SQL processing to enhance query efficiency.
Apache Polaris 1.2 enhances governance and connectivity, integrating better with Snowflake.
dltHub develops a Python-native data platform to accelerate data pipelines, combining simplicity with enterprise-grade governance.
Integration of Snowflake Cortex with Microsoft Teams to interact with data via a bot, using AI agents and semantic views.
nao offers a free version of its data IDE with integrated AI features, allowing users to connect data warehouses and execute SQL.
BigQuery introduces a Data Engineering Agent in preview, automating complex tasks.
Technical comparison between Kafka and Postgres for real-time data processing, highlighting Kafka's advantages for event streaming.
Version 1.0 of pg_duckdb, a PostgreSQL extension embedding DuckDB's vectorized analytical engine, is announced, bringing performance improvements and better MotherDuck integration.
Four senior data engineers address Reddit's top questions on fundamentals, data quality, and tech choices.
Technical comparison between Airflow and Dagster for lakehouse orchestration, highlighting Dagster's improvements in smart partitioning, event-driven architecture, and data quality framework.
ClickHouse Cloud introduces parallel replicas, enabling GROUP BY queries on over 100 billion rows in under a second, without pre-aggregation or data reshuffling.
Lakehouses using open table formats like Apache Iceberg and Delta Lake are becoming viable for observability, pairing Parquet’s columnar compression and filtering with schema evolution.
Netflix uses ClickHouse to process 5 petabytes of logs daily, with optimizations like generated lexers and native serialization.
Explanation of SQLite's concurrency limitations and solutions implemented by Jellyfin, including optimistic and pessimistic locking strategies.
Introduction to dbc, a command-line tool that manages connections and executes SQL queries.
Comparison between Openflow, a GUI-based tool built on Apache NiFi for data ingestion in Snowflake, and dlt, an open-source Python library for creating custom data pipelines.
dltHub enhances scaffolds by combining deterministic parsing and LLM for more reliable data pipeline generation.
Migrating from stored procedures to dbt enhances trust, attracts talent, and prepares for AI. dbt provides better documentation, integrated testing, and transparent data lineage.
The era of open data infrastructure: dbt Labs and Fivetran merge to create an open data infrastructure based on Apache Iceberg, enabling better data utilization and enhanced governance.
dbt Labs announces state-aware orchestration for Fusion projects, reducing compute costs by 29% or more by avoiding unnecessary processing. It uses intent-based configurations to optimize data freshness.
dbt Labs open-sources MetricFlow, its framework for defining metrics, aiming at vendor-neutral data interoperability and more consistent, collaborative data pipelines.
DuckDB can store and process videos by converting them into relational tables. The post demonstrates how to turn a movie into a table with 47 billion rows.
In-depth technical comparison between Apache Iceberg, Delta Lake, and Apache Hudi, highlighting their features, performance, and integrations.
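
For the SQLite concurrency item above, here is a minimal sketch of the two locking strategies in their textbook sense, using Python's standard `sqlite3` module. It is not Jellyfin's code, and the linked post may define the terms differently. The `items` table, its `plays` and `version` columns, and the function names are all hypothetical, chosen only for illustration.

```python
import sqlite3


def open_db(path=":memory:"):
    # isolation_level=None: autocommit mode, so BEGIN/COMMIT are issued explicitly.
    # timeout: how long a connection waits for a lock before raising an error.
    conn = sqlite3.connect(path, isolation_level=None, timeout=5.0)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS items ("
        "id INTEGER PRIMARY KEY, plays INTEGER NOT NULL, version INTEGER NOT NULL)"
    )
    return conn


def pessimistic_increment(conn, item_id):
    # Pessimistic: take the write lock up front with BEGIN IMMEDIATE,
    # so no other writer can interleave with this transaction.
    conn.execute("BEGIN IMMEDIATE")
    try:
        conn.execute(
            "UPDATE items SET plays = plays + 1, version = version + 1 WHERE id = ?",
            (item_id,),
        )
        conn.execute("COMMIT")
    except Exception:
        conn.execute("ROLLBACK")
        raise


def optimistic_increment(conn, item_id, max_retries=3):
    # Optimistic: read without holding a lock, then write only if the row's
    # version is unchanged; if another writer got there first, re-read and retry.
    for _ in range(max_retries):
        row = conn.execute(
            "SELECT plays, version FROM items WHERE id = ?", (item_id,)
        ).fetchone()
        if row is None:
            raise KeyError(item_id)
        plays, version = row
        cur = conn.execute(
            "UPDATE items SET plays = ?, version = ? WHERE id = ? AND version = ?",
            (plays + 1, version + 1, item_id, version),
        )
        if cur.rowcount == 1:
            return True  # our version check held, so the write was applied
    return False  # gave up after repeated conflicts
```

The pessimistic variant serializes writers at the cost of waiting, while the optimistic variant avoids holding a lock during the read but must be prepared to retry when a conflict is detected.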