Skip to content

Extend Codeanalyzer to Capture Database Entries #100

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
rahlk opened this issue Feb 1, 2025 · 2 comments · Fixed by #110
Open

Extend Codeanalyzer to Capture Database Entries #100

rahlk opened this issue Feb 1, 2025 · 2 comments · Fixed by #110
Assignees

Comments

@rahlk
Copy link
Collaborator

rahlk commented Feb 1, 2025

Is your feature request related to a problem? Please describe

The codeanalyzer currently lacks database operation analysis capabilities. It cannot detect, represent or analyze CRUD (Create, Read, Update, Delete) operations in Java classes, making it difficult to understand database interactions and data flow in applications.

Describe the solution you'd like

Add a CRUDOperation and CRUDQuery class to represent database operations with:

  1. Detection of database-related methods/annotations:
  • @Query, @Insert, @Update, @Delete annotations
  • JDBC/JPA/Hibernate method calls (persist, merge, remove, find, executeQuery, executeUpdate)
  • SQL/JPQL strings in code
  1. Properties to capture:
class CRUDOperation {
    private CRUDOperationType operationType;  // An Enum with CREATE, READ, UPDATE, DELETE
    private int line = -1;       // Source code location details
}

class CRUDQuery {
    private String queryString;    // Raw SQL or JPQL
    private CRUDQueryType queryType;  // An Enum with READ and WRITE
    private int line = -1;       // Source code location details
}
  1. Integration with CallSite and Callable objects to track database operations:
class CallSite {
    // Existing fields...
    private CRUDOperation crudOperation;
    private CRUDQuery crudQuery;
}

class Callable {
    // Existing fields...
    private List<CRUDOperation> crudOperations = new ArrayList<>();
    private List<CRUDQuery> crudQueries = new ArrayList<>();
}

Describe alternatives you've considered

  • Parsing raw SQL strings - Limited, misses ORM operations
  • Static analysis of database calls - Complex implementation
  • Configuration-based approach - Less accurate, requires manual updates

Additional context

The solution should support common Java persistence frameworks:

  • Jakarta and JavaEE
  • Spring Data JPA
  • Hibernate
  • JDBC
  • MyBatis
@rahlk rahlk self-assigned this Feb 1, 2025
@rahlk
Copy link
Collaborator Author

rahlk commented Feb 7, 2025

🧠 Brain-dump 🧠

Spring is actually a bit tricky. This is what we may have to do---

When we go the call site within a method say foo()...

  1. Get the receiver type (recvType) and called method name (calledMethName) from the callsite.

  2. If the receiver type (recvType) extends Repository (or Sub-Interfaces like JpaRepository), then do the following:

    a. see if the calledMethName implicitly match CRUD operations based on naming conventions (e.g., findBy*, deleteBy*, save*).

    b. Alternatively, check to see if the called method has a @Query annotation, and if so then decipher and categorize the custom query.

Methodological note

There is no explicit CURDQuery callsite per-se in Spring, it is the same callsite as CRUDOperation if the method called has @Query annotation. So, to populate the CRUDQuery object, we'll have to now get the annotation arguments too.

Moreover! The query in Spring can either be SQL or a dialect of SQL called JPQL. So the below tow are valid:

  1. JPQL

    @Query("SELECT u FROM User u WHERE u.name = :name")  
    List<User> findByName(String name);
  2. Traditional SQL

    @Query(value = "SELECT * FROM users WHERE name = :name", nativeQuery = true)
    List<User> findByName(String name);

But, it seems like there are well maintained parsers for these two! Hibernate's HQL parser, and JSQLParser.

@rahlk
Copy link
Collaborator Author

rahlk commented Feb 8, 2025

🧠 Brain-dump 🧠

In JPA, there is an ambiguity with determining whether query.executeUpdate() is an UPDATE or a DELETE operation. Although the name says executeUpdate, the user can create a query using entityManager.createQuery which is a delete operation as below

Query q = entityManager.createQuery("DELETE FROM User u WHERE u.status = 'inactive'");

And then somewhere else, they may call the executeUpdate() method.

q.executeUpdate();

Note that the above two examples needn't be in the same method or even in the same class. Take the below example for instance--

public class QueryBuilder {
    public static Query foo(EntityManager em) {
        return em.createQuery("DELETE FROM User u WHERE u.status = 'inactive'");
    }
}

Then, in some other class--

public class UserService {
    public void bar(EntityManager em) {
        Query updateQuery = QueryBuilder.foo(em);
        updateQuery.executeUpdate();  // UPDATE operation, but the execution context doesn't reveal that directly
    }
}

So, unless we do dataflow analysis on foo, then find the call site within foo to createQuery and then parse the SQL or JPQL string to determine if it's create or delete, we can't really know what exactly was executed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

1 participant