Mario Cartia – Medium

How to migrate data from Postgres (or any RDBMS) to MongoDB (or any NoSQL) in denormalized form…

Migrating data from a relational database (RDBMS) to a NoSQL one is a very common task. One of the most common use cases in which you want…

Oct 12, 2022

The fastest way to get a Jupyter-based local development environment for Apache Spark 3 in Scala

As one of my main activities is training on Big Data topics, I often find myself having to set up local development environments for using…

Jun 15, 2021

Published in
Agile Lab Engineering

Spark 3.0: First hands-on approach with Adaptive Query Execution (Part 3)

In the previous articles (1)(2), we started analyzing the individual features of Adaptive Query Execution introduced in Spark 3.0. In…

Dec 7, 2020

Published in
Agile Lab Engineering

Spark 3.0: First hands-on approach with Adaptive Query Execution (Part 2)

In the previous article, we started analyzing the individual features of Adaptive Query Execution introduced in Spark 3.0. In particular…

Nov 5, 2020

Published in
Agile Lab Engineering

Spark 3.0: First hands-on approach with Adaptive Query Execution (Part 1)

Apache Spark is a distributed data processing framework that is suitable for any Big Data context thanks to its features. Despite being a…

Oct 14, 2020

Published in
Agile Lab Engineering

How to create an Apache Spark 3.0 development cluster on a single machine using Docker

Apache Spark is the most widely used in-memory parallel distributed processing framework in the field of Big Data advanced analytics. The…

Sep 23, 2020

Using the PMML format to expose Machine Learning models via REST API through…

In the previous post I showed, in a simple way, how to create a REST API from a Machine Learning model built in…

Oct 10, 2019

How to create a REST API for serving a Python Machine Learning model with Google…

The Python + Jupyter combination is today almost a de facto standard when it comes to developing models for Machine (or Deep)…

Oct 7, 2019

Managing small files on HDFS: problem analysis and best practices

Hadoop is currently the de facto standard Big Data platform in the enterprise world. In particular HDFS, the Hadoop module that implements the…

May 24, 2019

Mario Cartia

Old school developer, veteran system administrator, technology lover and jazz piano player.
