[rqd] [cuegui] Add support for Loki for frame logs #1577

Open
wants to merge 59 commits into
base: master
Changes from 56 commits
Commits
59 commits
3a00739
Remove org.slf4j:slf4j-log4j12 from build.gradle which is already pro…
lithorus Nov 4, 2024
0feaaa3
Merge remote-tracking branch 'lithorus/bootrun-issues' into loki-logg…
lithorus Nov 5, 2024
67d93a4
Add loki properties for cuebot
lithorus Nov 5, 2024
34bb230
Merge remote-tracking branch 'origin/master' into loki-loggerv2
lithorus Nov 6, 2024
ed66ee5
Disable loki by default
lithorus Nov 6, 2024
ab8f16c
Add loki details to runFrame object
lithorus Nov 6, 2024
6abc9b0
Add loki_client module for logging to Loki
lithorus Nov 9, 2024
7c847e4
Add LokiLogger for loki logging and switch to using it if it's enable…
lithorus Nov 9, 2024
c47d926
Add more labels
lithorus Nov 9, 2024
7ca5065
Add loki details to proto file
lithorus Nov 9, 2024
ea93517
Add loki details to job object
lithorus Nov 10, 2024
5f31cf6
Add new LokiView plugin for viewing loki logs
lithorus Nov 10, 2024
f72c194
Read logs from the server when session is selected
lithorus Nov 10, 2024
154f45c
Some small cleanup
lithorus Nov 10, 2024
8215b28
Merge branch 'AcademySoftwareFoundation:master' into loki-loggerv2
lithorus Nov 11, 2024
c53673a
Move to end of class
lithorus Nov 11, 2024
7e5bfc8
Small spelling fix
lithorus Nov 11, 2024
a72b116
Move to end of class
lithorus Nov 11, 2024
a1327af
Add loki options to test config
lithorus Nov 11, 2024
ee25f0f
Remove requests as a dependency and fix various linting
lithorus Nov 11, 2024
fd7f111
Remove requests as a dependency and fix various linting
lithorus Nov 11, 2024
a957703
Remove debug print statement
lithorus Nov 11, 2024
424687e
Use timestamp to set start range
lithorus Nov 11, 2024
ec68dcd
Re-add the translation setup
lithorus Nov 12, 2024
3b7f479
Move new attributes down
lithorus Nov 12, 2024
eeb80ca
Move new attributes down
lithorus Nov 12, 2024
495f134
Small docstring fix
lithorus Nov 12, 2024
daf0232
Fix pylint errors
lithorus Nov 12, 2024
ab06d5b
Small pylint fixes
lithorus Nov 12, 2024
b313e55
Small pylint fix
lithorus Nov 12, 2024
6070bfd
Force older version of urllib3 for python 3.9 and below
lithorus Nov 13, 2024
d6d882a
Specify urllib3 for other python versions
lithorus Nov 13, 2024
c9c3b26
Add some basic pytests for the rqdlogging.LokiLogger and add to CI
lithorus Nov 14, 2024
f7e79e6
Switch to use published loki-urllib3-client
lithorus Nov 17, 2024
1066e97
Use job start date to query frame labels with
lithorus Nov 17, 2024
6e6bf3e
Merge remote-tracking branch 'origin/master' into loki-loggerv2
lithorus Nov 18, 2024
28f3cdd
Bump number of migration sql script
lithorus Nov 18, 2024
0417051
Don't repeat yourself..
lithorus Nov 18, 2024
204be81
Small tweak for consistency
lithorus Nov 18, 2024
05e0d02
Merge remote-tracking branch 'origin/master' into loki-loggerv2
lithorus Nov 19, 2024
fda4f0d
Make sure that the frame dispatch returns a non-null value for job.st…
lithorus Nov 19, 2024
0ffff4d
Merge branch 'job_os-default-value' into loki-loggerv2
lithorus Nov 19, 2024
1539db9
Make sure that the frame dispatch returns a non-null value for job.st…
lithorus Nov 19, 2024
497136d
Merge branch 'job_os-default-value' into loki-loggerv2
lithorus Nov 19, 2024
dd8437a
Remove next and prev buttons
lithorus Nov 19, 2024
d5f5903
Move element creation to init
lithorus Nov 19, 2024
3df6774
Move everything into init
lithorus Nov 19, 2024
83438f0
Re-order init elements
lithorus Nov 19, 2024
23b4db1
Rename signal to more generic name
lithorus Nov 19, 2024
83995bd
Merge remote-tracking branch 'origin/master' into loki-loggerv2
lithorus Nov 20, 2024
21f0157
Revert "Make sure that the frame dispatch returns a non-null value fo…
lithorus Nov 27, 2024
f1b2ad6
Revert "Make sure that the frame dispatch returns a non-null value fo…
lithorus Nov 27, 2024
32d0787
Merge remote-tracking branch 'origin/master' into loki-loggerv2
lithorus Nov 27, 2024
5b62826
Merge branch 'AcademySoftwareFoundation:master' into loki-loggerv2
lithorus Nov 28, 2024
588fc9e
Merge remote-tracking branch 'origin/master' into loki-loggerv2
lithorus Nov 29, 2024
1363010
Merge branch 'master' into loki-loggerv2
DiegoTavares Dec 4, 2024
f00f17a
Merge remote-tracking branch 'origin/master' into loki-loggerv2
lithorus Dec 4, 2024
01f3870
Added small description on how to configure for rqd framelogs to Loki
lithorus Dec 4, 2024
f5ec53b
Added comments regarding new parameters used by rqd and cuegui
lithorus Dec 4, 2024
2 changes: 1 addition & 1 deletion VERSION.in
@@ -1 +1 @@
-1.3
+1.4
1 change: 1 addition & 0 deletions ci/run_python_lint.sh
@@ -51,4 +51,5 @@ echo "Running lint for rqd/..."
cd rqd
python -m pylint --rcfile=../ci/pylintrc_main rqd --ignore=rqd/compiled_proto
python -m pylint --rcfile=../ci/pylintrc_test tests
python -m pylint --rcfile=../ci/pylintrc_test pytests
cd ..
1 change: 1 addition & 0 deletions ci/run_python_tests.sh
@@ -27,6 +27,7 @@ PYTHONPATH=pycue python -m unittest discover -s pyoutline/tests -t pyoutline -p
PYTHONPATH=pycue python -m unittest discover -s cueadmin/tests -t cueadmin -p "*.py"
PYTHONPATH=pycue:pyoutline python -m unittest discover -s cuesubmit/tests -t cuesubmit -p "*.py"
python -m pytest rqd/tests
python -m pytest rqd/pytests

# Xvfb no longer supports Python 2.
if [[ "$python_version" =~ "Python 3" && ${args[0]} != "--no-gui" ]]; then
Original file line number Diff line number Diff line change
@@ -72,5 +72,8 @@ public void setMinMemory(long minMemory) {
public long getMinMemory() {
return this.minMemory;
}

public boolean lokiEnabled;
public String lokiURL;
Comment on lines +77 to +78
Collaborator

Please add a comment explaining what they are and where they are being set.

Contributor Author

Done.

}

3 changes: 3 additions & 0 deletions cuebot/src/main/java/com/imageworks/spcue/JobDetail.java
@@ -59,5 +59,8 @@ public class JobDetail extends JobEntity implements JobInterface, DepartmentInte
public String getDepartmentId() {
return deptId;
}

public Boolean logLokiEnabled;
public String logLokiURL;
Comment on lines +65 to +66
Collaborator

Add a comment.

Contributor Author

Done.

}

Original file line number Diff line number Diff line change
@@ -531,6 +531,8 @@ private static final String replaceQueryForFifo(String query) {
"int_uid, " +
"str_log_dir, " +
"COALESCE(str_os, '') AS str_os, " +
"b_loki_enabled, " +
"str_loki_url, " +
"frame_name, " +
"frame_state, " +
"pk_frame, " +
@@ -572,6 +574,8 @@ private static final String replaceQueryForFifo(String query) {
"job.int_uid, " +
"job.str_log_dir, " +
"job.str_os, " +
"job.b_loki_enabled, " +
"job.str_loki_url, " +
"frame.str_name AS frame_name, " +
"frame.str_state AS frame_state, " +
"frame.pk_frame, " +
@@ -659,6 +663,8 @@ private static final String replaceQueryForFifo(String query) {
"job.int_uid, " +
"job.str_log_dir, " +
"job.str_os, " +
"job.b_loki_enabled, " +
"job.str_loki_url, " +
"frame.str_name AS frame_name, " +
"frame.str_state AS frame_state, " +
"frame.pk_frame, " +
@@ -747,6 +753,8 @@ private static final String replaceQueryForFifo(String query) {
"job.int_uid, " +
"job.str_log_dir, " +
"job.str_os, " +
"job.b_loki_enabled, " +
"job.str_loki_url, " +
"frame.str_name AS frame_name, " +
"frame.str_state AS frame_state, " +
"frame.pk_frame, " +
@@ -828,6 +836,8 @@ private static final String replaceQueryForFifo(String query) {
"job.int_uid, " +
"job.str_log_dir, " +
"job.str_os, " +
"job.b_loki_enabled, " +
"job.str_loki_url, " +
"frame.str_name AS frame_name, " +
"frame.str_state AS frame_state, " +
"frame.pk_frame, " +
@@ -912,6 +922,8 @@ private static final String replaceQueryForFifo(String query) {
"job.int_uid, " +
"job.str_log_dir, " +
"job.str_os, " +
"job.b_loki_enabled, " +
"job.str_loki_url, " +
"frame.str_name AS frame_name, " +
"frame.str_state AS frame_state, " +
"frame.pk_frame, " +
@@ -999,6 +1011,8 @@ private static final String replaceQueryForFifo(String query) {
"job.int_uid, " +
"job.str_log_dir, " +
"job.str_os, " +
"job.b_loki_enabled, " +
"job.str_loki_url, " +
"frame.str_name AS frame_name, " +
"frame.str_state AS frame_state, " +
"frame.pk_frame, " +
@@ -1087,6 +1101,8 @@ private static final String replaceQueryForFifo(String query) {
"job.int_uid, " +
"job.str_log_dir, " +
"job.str_os, " +
"job.b_loki_enabled, " +
"job.str_loki_url, " +
"frame.str_name AS frame_name, " +
"frame.str_state AS frame_state, " +
"frame.pk_frame, " +
@@ -1168,6 +1184,8 @@ private static final String replaceQueryForFifo(String query) {
"job.int_uid, " +
"job.str_log_dir, " +
"job.str_os, " +
"job.b_loki_enabled, " +
"job.str_loki_url, " +
"frame.str_name AS frame_name, " +
"frame.str_state AS frame_state, " +
"frame.pk_frame, " +
Original file line number Diff line number Diff line change
@@ -332,6 +332,8 @@ public DispatchFrame mapRow(ResultSet rs, int rowNum) throws SQLException {
frame.version = rs.getInt("int_version");
frame.services = rs.getString("str_services");
frame.os = rs.getString("str_os");
frame.lokiEnabled = rs.getBoolean("b_loki_enabled");
frame.lokiURL = rs.getString("str_loki_url");
return frame;
}
};
@@ -349,6 +351,8 @@ public DispatchFrame mapRow(ResultSet rs, int rowNum) throws SQLException {
"job.int_uid,"+
"job.str_log_dir,"+
"COALESCE(str_os, '') AS str_os, " +
"job.b_loki_enabled,"+
"job.str_loki_url,"+
"frame.str_name AS frame_name, "+
"frame.str_state AS frame_state, "+
"frame.pk_frame, "+
Original file line number Diff line number Diff line change
@@ -137,6 +137,8 @@ public JobDetail mapRow(ResultSet rs, int rowNum) throws SQLException {
job.showName = rs.getString("show_name");
job.facilityName = rs.getString("facility_name");
job.deptName = rs.getString("dept_name");
job.logLokiEnabled = rs.getBoolean("b_loki_enabled");
job.logLokiURL = rs.getString("str_loki_url");
return job;
}
};
@@ -206,6 +208,8 @@ public boolean isJobComplete(JobInterface job) {
"job.pk_dept,"+
"job.pk_folder,"+
"job.str_log_dir,"+
"job.b_loki_enabled,"+
"job.str_loki_url,"+
"job.str_name,"+
"job.str_shot,"+
"job.str_state,"+
@@ -473,20 +477,25 @@ public boolean updateJobFinished(JobInterface job) {
 "int_uid," +
 "b_paused," +
 "b_autoeat,"+
-"int_max_retries " +
+"int_max_retries," +
+"b_loki_enabled," +
+"str_loki_url " +
 ") " +
-"VALUES (?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?)";
+"VALUES (?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?)";

 @Override
 public void insertJob(JobDetail j, JobLogUtil jobLogUtil) {
 j.id = SqlUtil.genKeyRandom();
 j.logDir = jobLogUtil.getJobLogPath(j);
+j.logLokiEnabled = jobLogUtil.getLokiIsEnabled();
+j.logLokiURL = jobLogUtil.getLokiURL();
 if (j.minCoreUnits < 100) { j.minCoreUnits = 100; }

 getJdbcTemplate().update(INSERT_JOB,
 j.id, j.showId, j.groupId, j.facilityId, j.deptId,
 j.name, j.name, j.showName, j.shot, j.user, j.email, j.state.toString(),
-j.logDir, j.os, j.uid.orElse(null), j.isPaused, j.isAutoEat, j.maxRetries);
+j.logDir, j.os, j.uid.orElse(null), j.isPaused, j.isAutoEat, j.maxRetries,
+j.logLokiEnabled, j.logLokiURL);
 }

 private static final String JOB_EXISTS =
Original file line number Diff line number Diff line change
@@ -1190,7 +1190,9 @@ public Job mapRow(ResultSet rs, int rowNum) throws SQLException {
 .setHasComment(rs.getBoolean("b_comment"))
 .setAutoEat(rs.getBoolean("b_autoeat"))
 .setStartTime((int) (rs.getTimestamp("ts_started").getTime() / 1000))
-.setOs(SqlUtil.getString(rs,"str_os"));
+.setOs(SqlUtil.getString(rs,"str_os"))
+.setLokiEnabled(rs.getBoolean("b_loki_enabled"))
+.setLokiUrl(SqlUtil.getString(rs, "str_loki_url"));

 int uid = rs.getInt("int_uid");
 if (!rs.wasNull()) {
@@ -1935,6 +1937,8 @@ public Show mapRow(ResultSet rs, int rowNum) throws SQLException {
"SELECT " +
"job.pk_job,"+
"job.str_log_dir," +
"job.b_loki_enabled," +
"job.str_loki_url," +
"job_resource.int_max_cores," +
"job_resource.int_min_cores," +
"job_resource.int_max_gpus," +
Original file line number Diff line number Diff line change
@@ -385,6 +385,8 @@ public RunFrame prepareRqdRunFrame(VirtualProc proc, DispatchFrame frame) {
.setShow(frame.show)
.setUserName(frame.owner)
.setLogDir(frame.logDir)
.setLokiEnabled(frame.lokiEnabled)
.setLokiUrl(frame.lokiURL)
.setJobId(frame.jobId)
.setJobName(frame.jobName)
.setFrameId(frame.id)
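On the rqd side, the commit list above describes a LokiLogger that is used in place of the file logger when Loki is enabled. The following is a minimal sketch of that switch, not the PR's implementation: `select_log_backend` is a hypothetical helper, and the attribute names `loki_enabled`, `loki_url`, and `log_dir` are assumed from the `setLokiEnabled`, `setLokiUrl`, and `setLogDir` setters in the hunk above.

```python
from types import SimpleNamespace


def select_log_backend(run_frame):
    """Pick a log destination for a frame (hypothetical helper).

    Returns ("loki", url) only when Loki is enabled and a URL is set;
    otherwise falls back to ("file", log_dir).
    """
    enabled = getattr(run_frame, "loki_enabled", False)
    url = getattr(run_frame, "loki_url", "")
    if enabled and url:
        return ("loki", url)
    return ("file", run_frame.log_dir)


# Stand-ins for RunFrame messages; the field names are assumptions.
file_frame = SimpleNamespace(log_dir="/shots/logs", loki_enabled=False,
                             loki_url="http://localhost/loki/api")
loki_frame = SimpleNamespace(log_dir="/shots/logs", loki_enabled=True,
                             loki_url="http://localhost/loki/api")
```

Falling back to the file path when the URL is empty is a choice made for this sketch; the PR itself may handle that case differently.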
Original file line number Diff line number Diff line change
@@ -66,4 +66,12 @@ public String getJobLogRootDir(String os) {
 return env.getRequiredProperty("log.frame-log-root.default_os", String.class);
 }
 }
-}
+
+public Boolean getLokiIsEnabled() {
+return env.getRequiredProperty("log.loki.enabled", Boolean.class);
+}
+
+public String getLokiURL() {
+return env.getRequiredProperty("log.loki.url", String.class);
+}
+}
Original file line number Diff line number Diff line change
@@ -0,0 +1,6 @@
alter table job
add b_loki_enabled bool;

alter table job
add str_loki_url varchar(256);
Comment on lines +1 to +5
Collaborator

I'm not completely sold on the idea of every job having these set. I agree runFrame is the best place to communicate this value from cuebot to rqd, but storing it on the job table doesn't make much sense to me.

prepareRqdRunFrame in dispatch support could read the logpath directly from jobLogUtil, keeping the database out of the loop.
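The alternative the reviewer describes can be modelled as follows. This is an illustrative sketch in Python rather than cuebot's Java, and every name in it (`LokiSettings`, `RunFrame`, `prepare_run_frame`) is a hypothetical stand-in: the Loki fields are copied onto the run frame from configuration at dispatch time, and no job row is consulted.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LokiSettings:
    """Hypothetical stand-in for the log.loki.* properties."""
    enabled: bool
    url: str


@dataclass
class RunFrame:
    """Hypothetical stand-in for the RunFrame message sent to rqd."""
    job_id: str
    log_dir: str
    loki_enabled: bool = False
    loki_url: str = ""


def prepare_run_frame(job_id, log_dir, settings):
    """Build a RunFrame whose Loki fields come from configuration,
    not from columns stored on the job table."""
    return RunFrame(job_id=job_id, log_dir=log_dir,
                    loki_enabled=settings.enabled, loki_url=settings.url)


settings = LokiSettings(enabled=False, url="http://localhost/loki/api")
frame = prepare_run_frame("job-1", "/shots/logs", settings)
```

One consequence of this design is that a configuration change applies to the next dispatched frame, whereas values stored on the job row stay fixed at whatever was configured when the job was inserted.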


4 changes: 4 additions & 0 deletions cuebot/src/main/resources/opencue.properties
@@ -61,6 +61,10 @@ log.frame-log-root.default_os=${CUE_FRAME_LOG_DIR:/shots}
# are planning to use a folder in the root, use:
# - log.frame-log-root.Windows=${S:}

# Loki
log.loki.enabled=false
log.loki.url=http://localhost/loki/api
Collaborator

Please elaborate on the docs here. I would add:

  • a link to a tutorial on how to set up Loki
  • a quick description of what happens when this feature is on
  • what to expect if an invalid URL is provided

Contributor Author

lithorus Dec 4, 2024

I have added a small description. I will add general Loki documentation on opencue.io.


# Maximum number of jobs to query.
dispatcher.job_query_max=20
# Number of seconds before waiting to book the same job from a different host.
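The reviewer's last question in the thread above, what happens when `log.loki.url` is invalid, could be partly addressed by checking the value before it is used. The sketch below is an assumption about one possible check, not behaviour taken from this PR; `is_plausible_loki_url` is a hypothetical name, and it only checks the URL's shape, not whether a Loki server answers there.

```python
from urllib.parse import urlsplit


def is_plausible_loki_url(url):
    """Return True if url has an http(s) scheme and a host part.

    This is a shape check only; it never contacts the server.
    """
    try:
        parts = urlsplit(url)
    except ValueError:
        # urlsplit rejects some malformed inputs outright.
        return False
    return parts.scheme in ("http", "https") and bool(parts.netloc)


default_ok = is_plausible_loki_url("http://localhost/loki/api")
missing_scheme_ok = is_plausible_loki_url("localhost/loki/api")
```

A check like this would let a misconfigured URL be reported once at startup instead of surfacing later as failed log writes, though where to run it is left open here.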
6 changes: 5 additions & 1 deletion cuebot/src/test/resources/opencue.properties
@@ -93,4 +93,8 @@ dispatcher.memory.mem_reserved_min = 262144
 dispatcher.memory.mem_reserved_system = 524288
 dispatcher.memory.mem_gpu_reserved_default = 0
 dispatcher.memory.mem_gpu_reserved_min = 0
-dispatcher.memory.mem_gpu_reserved_max = 104857600
+dispatcher.memory.mem_gpu_reserved_max = 104857600
+
+# Loki
+log.loki.enabled = false
+log.loki.url = http://localhost/loki/api
1 change: 1 addition & 0 deletions cuegui/cuegui/App.py
@@ -30,6 +30,7 @@ class CueGuiApplication(QtWidgets.QApplication):

# Global signals
display_log_file_content = QtCore.Signal(object)
select_frame = QtCore.Signal(object, object)
double_click = QtCore.Signal(object)
facility_changed = QtCore.Signal()
single_click = QtCore.Signal(object)
1 change: 1 addition & 0 deletions cuegui/cuegui/FrameMonitorTree.py
@@ -359,6 +359,7 @@ def __itemSingleClickedViewLog(self, item, col):
old_log_files = []

self.app.display_log_file_content.emit([current_log_file] + old_log_files)
self.app.select_frame.emit(self.__job, item.rpcObject)

def __itemDoubleClickedViewLog(self, item, col):
"""Called when a frame is double clicked, views the frame log in a popup
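The two cuegui hunks above add a `select_frame` signal and emit it with the job and the clicked frame alongside the existing `display_log_file_content` signal. The sketch below shows how a viewer plugin might subscribe to it. To stay runnable without Qt it uses a tiny `Signal` stand-in instead of `QtCore.Signal`; `AppSketch` and `LokiViewSketch` are hypothetical names, not classes from the PR.

```python
class Signal:
    """Minimal stand-in for QtCore.Signal: connect callbacks, emit to all."""

    def __init__(self):
        self._slots = []

    def connect(self, slot):
        self._slots.append(slot)

    def emit(self, *args):
        for slot in self._slots:
            slot(*args)


class AppSketch:
    """Holds the two global signals used in the hunks above."""

    def __init__(self):
        self.display_log_file_content = Signal()
        self.select_frame = Signal()


class LokiViewSketch:
    """Hypothetical plugin that remembers the most recently selected frame."""

    def __init__(self, app):
        self.selected = None
        app.select_frame.connect(self.on_frame_selected)

    def on_frame_selected(self, job, frame):
        # A real plugin would query Loki here; this sketch only records the pair.
        self.selected = (job, frame)


app = AppSketch()
view = LokiViewSketch(app)
# Mirrors the single-click handler: both signals fire for one click.
app.display_log_file_content.emit(["/shots/logs/frame.rqlog"])
app.select_frame.emit("job-1", "frame-0001")
```

Because the file-based viewer and the Loki viewer listen on separate signals, either can be present without the other.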