Data Analysis With MongoDB Aggregation Pipeline to Enhance Software Development Services

Escalating demands of contemporary software development services call for efficient real-time data management to ensure scalable processing, flexible transformations, and integrated reporting for modern applications and microservices. MongoDB is a prominent data storage solution, well suit

Crafting Data Symphony

I hope you are all familiar with the name MongoDB and its contribution to modern software development services. This avant-garde data canvas helps orchestrate digital constellations in a schema-agnostic universe. It interweaves the threads of NoSQL choreography, artfully embracing JSON syntax as its melodic score, while its document-oriented dance transcends the limitations of traditional relational databases. With graceful sharding and replication choreography, MongoDB twirls through the big data cosmos, curating a captivating performance of data storage and retrieval.

Piecing Data Together

Aggregation is like assembling a puzzle: we piece together data fragments using stages to create a cohesive picture of meaningful information.

As far as software development is concerned, we can think of the puzzle as a collection in a database, a set of unstructured documents, or any other such pieces fed in at the different stages of the pipeline. All these pieces are combined to create a meaningful picture. The different stages come together to produce a coherent output document.

Let us take a deeper dive into this description and translate it into a simple aggregation process:

In the above diagram, an operation starts from a collection in the database, runs through the required number of stages, and finally produces the desired output.

Here, each stage is one piece of the puzzle. All the pieces are combined with each other to give the output picture.

MongoDB aggregation works the same way as the above example.

In the MongoDB aggregation pipeline, the pieces are the different stages, such as $match, $sort, $lookup, and $unwind. The stages are executed one after another, each working on the output of the previous one, to give the desired result.
A pipeline can contain as many stages as the desired output requires.
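
To make the stage-by-stage idea concrete, here is a minimal in-memory sketch in plain JavaScript. It is not how MongoDB implements aggregation: the runPipeline helper, the stageHandlers table, and the sample documents are all hypothetical, and only equality-based $match, $sort, $skip, and $limit are modeled. It only illustrates that each stage receives the output of the previous one.

```javascript
// Toy model of a pipeline: every stage takes an array of documents
// and returns a new array. Hypothetical helper, not MongoDB's implementation.
const stageHandlers = {
  // Keep documents whose fields equal every value in the condition (equality only).
  $match: (docs, cond) =>
    docs.filter((doc) => Object.entries(cond).every(([key, value]) => doc[key] === value)),
  // Sort by one or more keys; 1 means ascending, -1 means descending.
  $sort: (docs, spec) =>
    [...docs].sort((a, b) => {
      for (const [key, dir] of Object.entries(spec)) {
        if (a[key] < b[key]) return -dir;
        if (a[key] > b[key]) return dir;
      }
      return 0;
    }),
  $skip: (docs, n) => docs.slice(n),
  $limit: (docs, n) => docs.slice(0, n),
};

// Feed the documents through the stages in order.
function runPipeline(docs, pipeline) {
  return pipeline.reduce((current, stage) => {
    const [[name, arg]] = Object.entries(stage);
    const handler = stageHandlers[name];
    if (!handler) throw new Error(`Unsupported stage in this sketch: ${name}`);
    return handler(current, arg);
  }, docs);
}

// Hypothetical sample data.
const samplePositions = [
  { job_position: 'Backend Developer', is_active: true },
  { job_position: 'QA Engineer', is_active: false },
  { job_position: 'Android Developer', is_active: true },
];

const output = runPipeline(samplePositions, [
  { $match: { is_active: true } },
  { $sort: { job_position: 1 } },
  { $limit: 1 },
]);

console.log(output); // [ { job_position: 'Android Developer', is_active: true } ]
```

Reordering the stages changes the result, which is why the order of pieces in the puzzle matters.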

Solving A Real-Time Problem

Here is a real-time problem related to MongoDB queries. With a simple query, we cannot retrieve data from the fields of a referenced (embedded) collection. In that case, we can use the MongoDB aggregation pipeline.

Here is a real-time example of this case:

Example:

In the above figure, we have two documents in the job positions collection. I want to retrieve data by performing a search operation on the input field.

Problem Statement:

With plain MongoDB queries, we use the find() method to perform the search operation. But with find() alone I did not get the data of the referenced collection. In our example we have a job_title field, and the job titles collection has 4-5 fields. Using the find() and populate() methods, we did not get all of that field data when a search was applied.

In that case, I used code of this type:

job_positions.find(find).populate('job_title').sort(sort).skip(skip).limit(limit)

But whenever a search was applied through the find object, I did not get the value of the job_title field.

That is why I used the MongoDB aggregation pipeline method.
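
One likely reason for this behavior, assuming job_title is stored in each job position as the id of a job title document, is the order of operations: the find filter runs against the job positions first, and populate() fills in the referenced documents only afterwards. The plain JavaScript sketch below imitates both orders on hypothetical in-memory data. The collections, ids, and helper names are made up for illustration, and no MongoDB or Mongoose call is involved.

```javascript
// Hypothetical data: each position stores only the id of its job title.
const jobTitles = [
  { _id: 't1', job_title: 'Engineer', is_active: true },
  { _id: 't2', job_title: 'Designer', is_active: true },
];
const jobPositions = [
  { _id: 'p1', job_position: 'Backend', job_title: 't1' },
  { _id: 'p2', job_position: 'UI', job_title: 't2' },
];

const search = /engineer/i;

// Replace the stored id with the referenced title document.
const join = (position) => ({
  ...position,
  job_title: jobTitles.find((title) => title._id === position.job_title),
});

// The title text is only readable once the reference has been resolved.
const titleText = (position) =>
  typeof position.job_title === 'object' ? position.job_title.job_title : '';

// Filter first, join later: the filter only ever sees the id 't1' or 't2'.
const filterThenJoin = jobPositions.filter((p) => search.test(titleText(p))).map(join);

// Join first, filter later: the order that $lookup followed by $match gives.
const joinThenFilter = jobPositions.map(join).filter((p) => search.test(titleText(p)));

console.log(filterThenJoin.length); // 0
console.log(joinThenFilter.map((p) => p.job_position)); // [ 'Backend' ]
```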

Problem Solutions:

There are multiple ways to carry out aggregation in MongoDB:

  1. Aggregation Pipeline
  2. Map-Reduce Function
  3. Single-Purpose Aggregation Methods

Here I used the aggregation pipeline method to solve the above problem.

So let's start.

First I defined the pipeline, and then I used the aggregate() function to execute it.

// Define the aggregation pipeline
  const pipeline = [
    // Join the "jobtitles" collection using the "job_title" field
    {
      $lookup: {
        from: 'jobtitles',
        localField: 'job_title',
        foreignField: '_id',
        as: 'job_title'
      }
    },

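    // Filter the joined documents; andQuery is an array of conditions
    // and must not be empty, because $and rejects an empty array
    {
      $match: {
        $and: andQuery
      }
    },

    // Sort the output documents
    {
      $sort: sort
    },

    // Pagination: skipQuery and limitQuery are expected to be stage objects,
    // for example { $skip: 20 } and { $limit: 10 }
    skipQuery,
    limitQuery,

    // Shape the output documents
    {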
      $project: {
        _id: 0,
        job_position_id: '$_id',
        job_title: {
          job_title_id: '$job_title._id',
          job_title: 1,
          is_active: 1
        },
        job_position: 1,
        description: 1,
        created_date: 1,
        modified_date: 1,
        created_by: 1,
        modified_by: 1,
        is_active: 1
      }
    }
  ];

  // Execute the aggregation pipeline
  const results = await jobPositionModel.aggregate(pipeline);

  return results;
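
The pipeline above relies on andQuery, sort, skipQuery, and limitQuery, which are defined elsewhere in the original code and not shown here. The following is only a sketch of how such values might be built from a search request. The buildPipelineInputs and escapeRegex helpers, their parameter names, the defaults, and the choice of searchable fields are all assumptions, not part of the original implementation.

```javascript
// Escape characters that have a special meaning in a regular expression,
// so user input is matched literally.
function escapeRegex(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Hypothetical helper: builds the values the pipeline expects.
function buildPipelineInputs({
  search = '',
  page = 1,
  pageSize = 10,
  sortField = 'created_date',
  sortOrder = -1,
} = {}) {
  const andQuery = [];
  const term = search.trim();

  if (term) {
    const pattern = { $regex: escapeRegex(term), $options: 'i' };
    // After $lookup, the joined title is reachable as job_title.job_title.
    andQuery.push({
      $or: [{ job_position: pattern }, { 'job_title.job_title': pattern }],
    });
  }

  // $and rejects an empty array, so fall back to a condition every document meets.
  if (andQuery.length === 0) {
    andQuery.push({ _id: { $exists: true } });
  }

  const safePage = Math.max(1, Math.floor(page));
  return {
    andQuery,
    sort: { [sortField]: sortOrder },
    skipQuery: { $skip: (safePage - 1) * pageSize },
    limitQuery: { $limit: pageSize },
  };
}

const inputs = buildPipelineInputs({ search: 'C++ dev', page: 2, pageSize: 5 });
console.log(JSON.stringify(inputs, null, 2));
```

Escaping the search term keeps characters such as + or ( from being read as regular expression syntax, and the fallback condition keeps the $match stage valid when no search text is given.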

Finally, we got the desired output when the search is applied.

A sample of the desired output can be seen below:

As per the above image, when a search is applied, the title content is shown in the table.

Previously, it was not shown when using normal find() and populate() queries in MongoDB.

Reference - https://www.zymr.com/blog/data-analysis-with-mongodb-aggregation-pipeline-to-enhance-software-development-services