Mario Cartia – Medium

Mario Cartia

How to migrate data from Postgres (or any RDBMS) to MongoDB (or any NoSQL) in denormalized form…

Migrating data from a relational database (RDBMS) to a NoSQL one is a very common task. One of the most common use cases in which you want…

8 min read · Oct 12, 2022

Mario Cartia

The fastest way to get a Jupyter-based local development environment for Apache Spark 3 in Scala

As one of my main activities is training on Big Data topics, I often find myself having to set up local development environments for using…

3 min read · Jun 15, 2021

Mario Cartia in Agile Lab Engineering

Spark 3.0: First hands-on approach with Adaptive Query Execution (Part 3)

In the previous articles (1)(2), we started analyzing the individual features of Adaptive Query Execution introduced in Spark 3.0. In…

4 min read · Dec 7, 2020

Mario Cartia in Agile Lab Engineering

Spark 3.0: First hands-on approach with Adaptive Query Execution (Part 2)

In the previous article, we started analyzing the individual features of Adaptive Query Execution introduced in Spark 3.0. In particular…

4 min read · Nov 5, 2020

Mario Cartia in Agile Lab Engineering

Spark 3.0: First hands-on approach with Adaptive Query Execution (Part 1)

Apache Spark is a distributed data processing framework that is suitable for any Big Data context thanks to its features. Despite being a…

7 min read · Oct 14, 2020

Mario Cartia in Agile Lab Engineering

How to create an Apache Spark 3.0 development cluster on a single machine using Docker

Apache Spark is the most widely used in-memory parallel distributed processing framework in the field of Big Data advanced analytics. The…

3 min read · Sep 23, 2020

Mario Cartia

Using the PMML format to expose Machine Learning models via REST API through…

In the previous post I showed, in a simple way, how to create a REST API starting from a Machine Learning model built in…

3 min read · Oct 10, 2019

Mario Cartia

How to create a REST API for serving a Python Machine Learning model with Google…

The Python + Jupyter combination is today almost a de facto standard for developing Machine (or Deep)…

3 min read · Oct 7, 2019

Mario Cartia

Managing small files on HDFS: problem analysis and best practices

Hadoop is currently the de facto standard Big Data platform in the enterprise world. In particular HDFS, the Hadoop module that implements the…

4 min read · May 24, 2019

Mario Cartia

Old school developer, veteran system administrator, technology lover and jazz piano player.
