Using Groovy 4: GINQ for the Win

Rik Scarborough | Development Technologies, Groovy, Java, Programming

In my last blog post, “Back in the Groovy 4,” I briefly mentioned Groovy-Integrated Query (GINQ). I’ve been wanting to write about how I would use this new feature, and I decided to take this opportunity to do so.

In this post, I will describe two examples of using GINQ. The first is a requirement I faced on a recent project, and I’ll demonstrate how I used GINQ to fulfill it. The second is a simple recreation of a comparison task I have run into several times. A quick disclaimer: this is not a tutorial on GINQ. This post is merely a discussion of how I’ve used GINQ and how I plan on making it part of my toolkit.

The Apache Groovy team seems to be doing a good job overall of documenting GINQ and the other features of Groovy. If you want to learn more about this feature, I suggest checking out the blog post “Using GINQ.”

Example One

Exporting a Subset of Data

Right after I wrote the aforementioned article, I had a requirement to export a subset of data from a backup for an application I wrote many years ago. The backup was in the JSON format, and, for various reasons, the application is no longer supported or supportable. Therefore, I had to work with the data from the backup.

The database and the JSON file contain membership data that includes, but is not limited to, the following information:

1. Member’s name
2. Id field
3. Member’s membership number
4. Date of birth
5. List of email addresses associated with the member
6. List of postal addresses associated with the member
7. List of membership information called authorizations
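To make the shape of the data concrete, here is a minimal sketch of what one member record might look like once parsed. The field names (members, name, status, authorization, code, email, emailAddress) are taken from the export script later in this post; the values and the id field name are invented for illustration, and the real backup certainly contains more fields.

```groovy
import groovy.json.JsonSlurper

// Hypothetical sample record; every value here is made up.
def sample = '''
{
  "members": [
    {
      "id": 1,
      "name": "Pat Example",
      "status": "active",
      "email": [ { "emailAddress": "pat@example.com" } ],
      "authorization": [ { "code": "EQGR" }, { "code": "XXAA" } ]
    }
  ]
}
'''

def obj = new JsonSlurper().parseText(sample)
def member = obj.members[0]

// GPath spreads the property access across the list of authorizations,
// so this yields a plain list of codes.
def codes = member.authorization.code
assert codes == ['EQGR', 'XXAA']
assert codes.contains('EQGR')

// The first email address, or null if the member has none.
assert member.email[0]?.emailAddress == 'pat@example.com'
```

The point to notice is that authorization.code is a list, which is why the export script can call contains() on it.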

The requirement was to export the member’s name, their active or inactive status, the keyword of their authorization, and their email address. The export would cover only members who hold certain authorizations.

GINQ to the Rescue!

There are multiple ways of fulfilling this requirement without GINQ. Had I not wanted to try out GINQ, I would have considered one of three solutions. I’ll share those with you here for the sake of transparency, but there’s a reason I ended up using GINQ instead.

Solution 1: I could have gone back to the original database and written the export in SQL.

Solution 2: I could have imported the entire JSON file into another database, such as MySQL, and again used SQL.

Solution 3: I could have written something in Python that would read the JSON and then written something very SQL-like to export the data.

Why does SQL appear so much in the possible solutions? Simple. It’s a language built for working with data. GINQ has an SQL-like syntax, and therefore, it’s also geared toward working with data.
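As a rough illustration of that resemblance, here is a minimal sketch that runs a GINQ query over an in-memory list, with the approximate SQL counterpart in a comment. The members list is hypothetical data, and the script assumes Groovy 4 with the GINQ module available.

```groovy
// Hypothetical in-memory data standing in for a database table.
def members = [
    [name: 'Lee', status: 'inactive'],
    [name: 'Ana', status: 'active'],
    [name: 'Bo',  status: 'active']
]

// Roughly: SELECT name FROM members WHERE status = 'active' ORDER BY name
def activeNames = GQ {
    from m in members
    where m.status == 'active'
    orderby m.name
    select m.name
}.toList()

assert activeNames == ['Ana', 'Bo']
```

The clauses appear in a slightly different order than in SQL (from comes first, select last), but the vocabulary carries over almost directly.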

My go-to for scripting is Groovy. This is my preference because I have over 20 years of Java development experience and around 15 years of playing with Groovy. GINQ adds data-oriented programming to the mix, which, in my opinion, makes Groovy an even better solution.

Are you wondering how GINQ helped me? The best way to explain is to show you. Here’s the code I wrote to do the export.

 
import groovy.json.JsonSlurper

def jsonS = new JsonSlurper()

def obj = jsonS.parse(new File("backup20211208.json"))

def result = GQ {
    from o in obj.members
    where o.authorization.code.contains("EQGR") 
    || o.authorization.code.contains("EQGG") 
    || o.authorization.code.contains("EQMA") 
    || o.authorization.code.contains("EQCC")  
    || o.authorization.code.contains("EQMC")  
    || o.authorization.code.contains("EQDR")  
    || o.authorization.code.contains("EQWJ")  
    || o.authorization.code.contains("EQFJ")
    orderby o.name
    select o.name, o.status, o.authorization.code, o.email[0]?.emailAddress
}

// Each row holds the four selected columns: name, status, list of codes, first email.
println "Name,Status,Auths,Email"
result.each() {
    println "${it[0]},${it[1]},\"${it[2].join(',')}\",${it[3] ?: ''}"
}

The GINQ code is in the GQ block. This could have been accomplished in basic Groovy, but I feel this is easier to read and modify for different criteria.
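For comparison, here is a sketch of what the basic-Groovy route might look like, using findAll, sort, and collect. The members list is a hypothetical stand-in for obj.members, and the wanted list simply gathers the codes from the query above; this is one possible way to write it, not the only one.

```groovy
// Hypothetical stand-in for obj.members from the parsed backup.
def members = [
    [name: 'Zed', status: 'active',   authorization: [[code: 'EQMA']],
     email: []],
    [name: 'Amy', status: 'inactive', authorization: [[code: 'XXAA']],
     email: [[emailAddress: 'amy@example.com']]],
    [name: 'Bob', status: 'active',   authorization: [[code: 'EQGR'], [code: 'XXAA']],
     email: [[emailAddress: 'bob@example.com']]]
]

// The authorization codes that qualify a member for the export.
def wanted = ['EQGR', 'EQGG', 'EQMA', 'EQCC', 'EQMC', 'EQDR', 'EQWJ', 'EQFJ']

def rows = members
    .findAll { m -> m.authorization.code.any { it in wanted } }  // filter (where)
    .sort { it.name }                                            // order (orderby)
    .collect { m ->                                              // project (select)
        [m.name, m.status, m.authorization.code.join(','), m.email[0]?.emailAddress ?: '']
    }

// Build the CSV lines: name, status, quoted code list, first email (may be empty).
def lines = rows.collect { r -> "${r[0]},${r[1]},\"${r[2]}\",${r[3]}".toString() }

println "Name,Status,Auths,Email"
lines.each { println it }

assert lines == ['Bob,active,"EQGR,XXAA",bob@example.com', 'Zed,active,"EQMA",']
```

Both versions do the same work. The difference is mostly in how the intent reads: the closure chain describes steps, while the GQ block describes the result.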

Example Two

Comparing Two Data Sources

Several times throughout my career, I’ve been tasked with determining whether two data sources contain the same set or subset of records. If they exist in the same database, it’s easily done in SQL. If the data is small, it can simply be done with a good editor that has a diff function.

However, in larger datasets with different sources, this can be much more challenging. For example, if one source is a database and the other is a web service or a downloaded file, comparing sets is going to be incredibly tricky.

GINQ to the Rescue!

In the past, I’ve used Python. Now, I won’t have to. GINQ gives me an excellent and easy way to use Groovy, and this will be my preference going forward. Being able to use logic that is based on an SQL-like, data-centric language is better for scripts that are used for data analysis. We can put the focus on understanding the data.

I haven’t had the opportunity to use GINQ in a situation like this in the wild, but here’s a simple recreation of the logic to demonstrate. Let’s say I have two lists of books: one of books I have and another of books I want. Here is a quick script to find any books in the list I want that are not yet in the list I have.

import groovy.json.JsonSlurper

def jsonS = new JsonSlurper()

def books = jsonS.parse(new File("books.json"))
def booksIWant = jsonS.parse(new File("booksIWant.json"))

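// Keep each wanted book for which no equal entry exists among the books I have.
def result = GQ {
    from w in booksIWant
    where !(
        from b in books
        where b == w
        select b
    ).exists()
    select w
}

// Print the books still missing from my collection.
result.toList().each { println it }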
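To try the same logic without any files on disk, here is a self-contained sketch with two small inline lists. The lists and the title field are hypothetical stand-ins for the parsed JSON files. It assumes each book is a map and that two entries count as the same book only when all of their fields are equal.

```groovy
// Hypothetical book lists standing in for books.json and booksIWant.json.
def books = [
    [title: 'Groovy in Action'],
    [title: 'Effective Java']
]
def booksIWant = [
    [title: 'Effective Java'],
    [title: 'Making Java Groovy']
]

// "Not exists" subquery: keep wanted books with no equal entry in books.
def missing = GQ {
    from w in booksIWant
    where !(
        from b in books
        where b == w
        select b
    ).exists()
    select w
}.toList()

assert missing*.title == ['Making Java Groovy']

// Cross-check against a plain-Groovy filter.
assert missing == booksIWant.findAll { w -> !books.contains(w) }
```

If the two sources describe the same book with different fields, the b == w comparison would need to be replaced with a comparison on a shared key, such as a title or ISBN.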

Conclusion

I’ve shown a couple of examples of how I use (or would use) GINQ. In both, GINQ allows me personally to be more efficient and effective. GINQ has already become an important part of my toolkit, and I’m thinking of more ways to use it. GINQ is a data-oriented language and, depending on the project, it can be easier to read and modify. For these reasons, I would encourage you to give it a shot in your own projects!

If you enjoyed this post, check out some of my others on the Keyhole Dev Blog!
