The aim is to develop a web application with REST APIs for consuming data from a Hive data warehouse.
This app queries the Hive data warehouse using the requested columns, filters and limit, and returns the results to the user in the form of a file.
- Functionalities
- Architecture
- Client
- Server
- Technologies used
- How to run?
- Swagger Documentation
- Future Goals
- Acknowledgement
Query a single Hive table by column names, apply filters through a WHERE clause, and limit the number of rows returned.
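As a rough illustration of how columns, filters and limit could map onto a single-table query, here is a minimal sketch in plain Java. `HiveQueryBuilder` and its `build` method are hypothetical names, and joining multiple filters with `AND` is an assumption; the source does not say how filters are combined. Because the query is assembled by string concatenation, every part is assumed to have passed validation first.

```java
import java.util.List;

// Hypothetical helper (not from the project): assembles a HiveQL SELECT
// from request parts that are assumed to be validated already.
final class HiveQueryBuilder {
    static String build(String db, String table, List<String> columns,
                        List<String> filters, int limit) {
        StringBuilder sql = new StringBuilder("SELECT ")
                .append(String.join(", ", columns))
                .append(" FROM ").append(db).append('.').append(table);
        if (!filters.isEmpty()) {
            // Assumption: multiple filters are combined with AND.
            sql.append(" WHERE ").append(String.join(" AND ", filters));
        }
        return sql.append(" LIMIT ").append(limit).toString();
    }
}
```

For example, calling `build` with database `default`, table `medicare_demographic`, columns `name` and `age`, the filter `age > 30` and limit `100` yields `SELECT name, age FROM default.medicare_demographic WHERE age > 30 LIMIT 100`.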
We have multiple REST APIs for different functions.
- It is a POST API.
- Aim: To save the details of the form submitted by the user, validate them, and return to the user a UUID and the location of the file in which the query results will be stored.
- Can be hit using `localhost:8080/api/save`. It is called when the user submits the form.
- All the information submitted by the user through the form is sent as JSON in the body of the request, as shown below.
{
"columns": ["name", "age"],
"filters": ["name"],
"limit": "100",
"table": "medicare_demographic",
"db": "default"
}
- A UUID is assigned to the submitted `Request`.
- A response `Valid` is added to this `Request` and stored as a `GetResponse`.
- An example of a `GetResponse` object is shown below.
{
response:Valid,
columns:[name, age],
filters:[name],
limit:100,
table:medicare_demographic,
database:default
}
- The above `GetResponse` is stored in a `map<UUID, GetResponse>`. This map will contain the updated status of the Hive query from the backend.
- An example of `RequestMap` is shown below.
{
2a5c211d-5b24-43ac-b1f4-362d3b3abe1d :
{
response:Valid,
columns:[name, age],
filters:[name],
limit:100,
table:medicare_demographic,
database:default
}
}
- Validate the `Request` and return the UUID, file location and response to the user.
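The save flow described above (assign a UUID, store a `GetResponse` in the map, read the status back later) can be sketched with JDK classes only. This is an illustrative outline rather than the project's code: the field layout follows the `GetResponse` example, while `RequestStore`, its method names and the `Unknown` status for a missing UUID are assumptions.

```java
import java.util.List;
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

// Field names follow the GetResponse example; the class layout is assumed.
final class GetResponse {
    volatile String response;   // e.g. "Valid", later updated by the backend
    final List<String> columns;
    final List<String> filters;
    final int limit;
    final String table;
    final String database;

    GetResponse(String response, List<String> columns, List<String> filters,
                int limit, String table, String database) {
        this.response = response;
        this.columns = columns;
        this.filters = filters;
        this.limit = limit;
        this.table = table;
        this.database = database;
    }
}

// Hypothetical wrapper around the map<UUID, GetResponse>.
final class RequestStore {
    private final Map<UUID, GetResponse> requestMap = new ConcurrentHashMap<>();

    // Assigns a fresh UUID to the request and stores it.
    UUID save(GetResponse request) {
        UUID id = UUID.randomUUID();
        requestMap.put(id, request);
        return id;
    }

    // Returns the current status; "Unknown" for a missing UUID is an assumption.
    String status(UUID id) {
        GetResponse stored = requestMap.get(id);
        return stored == null ? "Unknown" : stored.response;
    }
}
```

A `ConcurrentHashMap` is used here because the map is read by request handlers while a background job updates it; the source only says that a map is used.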
- It is a GET API.
- Aim: To return the current status of the query from the hashmap.
- Can be hit using `localhost:8080/api/status/{UUID}`. It is called when the user clicks the refresh button.
- It is a GET API.
- Aim: To return the list of databases collected from the Hive warehouse.
- Can be hit using `localhost:8080/api/getdbs`.
- It is a GET API.
- Aim: To return the list of tables collected from the Hive warehouse for a particular database.
- Can be hit using `localhost:8080/api/gettables/{db}`.
- It is a POST API.
- It takes the database name and table name as its payload.
- Aim: To return the list of columns collected from the Hive warehouse for a particular database and table.
- Can be hit using `localhost:8080/api/getcols`.
- Input params: array of column names, array of filters, limit, source name, database name.
- It checks whether the source and database exist in the data warehouse.
- It then checks whether the given columns exist in the given database and whether the provided limit is valid.
- It also runs the data type matching function.
- It uses a regex pattern matcher to split each filter condition into its left and right clauses and to match them against the appropriate data type.
- Thorough checks validate the LHS of each filter and the operator in between, followed by a data type check based on the LHS column. Together these checks guard the system against SQL injection.
- An appropriate error or valid message is returned, which in turn sets the response variable added to the global hashmap.
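The checks listed above might look roughly like the following sketch. It is not the project's validator: `RequestValidator`, the regex patterns, the supported operators, the two type names (`int`, `string`) and the error messages are all assumptions chosen for illustration, and the check that the source and database exist is left out because it needs a live warehouse.

```java
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical validator; patterns and messages are illustrative assumptions.
final class RequestValidator {
    // LHS column, comparison operator, RHS literal.
    private static final Pattern FILTER =
            Pattern.compile("^\\s*(\\w+)\\s*(<=|>=|!=|=|<|>)\\s*(.+?)\\s*$");
    private static final Pattern INT_LITERAL = Pattern.compile("-?\\d+");
    // Single-quoted text with no embedded quote or backslash.
    private static final Pattern STRING_LITERAL = Pattern.compile("'[^'\\\\]*'");

    // columnTypes maps each column of the table to "int" or "string".
    static String validate(Map<String, String> columnTypes, List<String> columns,
                           List<String> filters, String limit) {
        for (String column : columns) {
            if (!columnTypes.containsKey(column)) {
                return "Unknown column: " + column;
            }
        }
        // Positive integer with at most nine digits.
        if (!limit.matches("[1-9]\\d{0,8}")) {
            return "Invalid limit";
        }
        for (String filter : filters) {
            Matcher m = FILTER.matcher(filter);
            if (!m.matches()) {
                return "Malformed filter: " + filter;
            }
            String type = columnTypes.get(m.group(1));
            if (type == null) {
                return "Unknown filter column: " + m.group(1);
            }
            // The RHS must be a literal of the LHS column's type, nothing more.
            Pattern literal = type.equals("int") ? INT_LITERAL : STRING_LITERAL;
            if (!literal.matcher(m.group(3)).matches()) {
                return "Type mismatch in filter: " + filter;
            }
        }
        return "Valid";
    }
}
```

Requiring the whole right-hand side to be a single literal of the expected type is what rejects input such as `name = 'x' OR 1=1`, since the trailing `OR 1=1` is not part of a string literal.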
- It is scheduled to run every 1 second.
- It gets each UUID key whose value is `Valid` from the hashmap, runs the query according to the `Request`, and updates the value to `Started and Running` in the hashmap.
- It updates the response of the query in the hashmap to `Complete` or `Failure` on successful or unsuccessful query completion, respectively.
- It writes 'No records found' to the generated file if validation and query execution succeed but no records match.
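The polling job above can be approximated with the JDK's `ScheduledExecutorService`. The project may well use Spring's own scheduling instead, so treat this as a stand-in sketch: `QueryPoller`, `pollOnce` and the `runQuery` callback are hypothetical, and the map is simplified to hold only the status string per UUID.

```java
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Predicate;

// Hypothetical poller; the map is simplified to UUID -> status string.
final class QueryPoller {
    final Map<UUID, String> status = new ConcurrentHashMap<>();
    // Assumed callback: runs the Hive query, returns true on success.
    private final Predicate<UUID> runQuery;

    QueryPoller(Predicate<UUID> runQuery) {
        this.runQuery = runQuery;
    }

    // One pass: pick up every "Valid" request and run its query.
    void pollOnce() {
        for (UUID id : status.keySet()) {
            // replace(key, old, new) is atomic, so each request is claimed once.
            if (status.replace(id, "Valid", "Started and Running")) {
                boolean succeeded;
                try {
                    succeeded = runQuery.test(id);
                } catch (RuntimeException e) {
                    succeeded = false;
                }
                status.put(id, succeeded ? "Complete" : "Failure");
            }
        }
    }

    // Schedules pollOnce with a 1 second delay between runs.
    ScheduledExecutorService start() {
        ScheduledExecutorService scheduler =
                Executors.newSingleThreadScheduledExecutor();
        scheduler.scheduleWithFixedDelay(this::pollOnce, 0, 1, TimeUnit.SECONDS);
        return scheduler;
    }
}
```

Requests whose status is anything other than `Valid` (for example a validation error message) are left untouched, which matches the description that only `Valid` entries are executed. The caller is responsible for calling `shutdown()` on the returned scheduler.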
- Spring Boot
- JDK 8
- Hive Query Language
- Angular
- TypeScript
- HTML/CSS
- Apache Hadoop
- Apache Hive
- Maven
- Spring Boot Swagger UI
- On Windows
  - Run the server, using Maven: `.\mvnw spring-boot:run`
  - Run the client, using ng: `ng serve`
- Hit `localhost:8080/swagger-ui.html`
Thanks to Ishita and Parnika for contributing the robust validation and query methods and other REST APIs.
Table of contents generated with markdown-toc