Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Unable to parse ISO8601-compliant timestamp with a colon in the timezone #1367

Open
MakotoTheKnight opened this issue Jan 14, 2025 · 15 comments
Labels

Comments

@MakotoTheKnight
Copy link

MakotoTheKnight commented Jan 14, 2025

lnav version
v0.12.3

Describe the bug
When using lnav against Java logs with a timezone that includes a colon, the parsing for it fails and reverts to using the generic log configuration. This eliminates the ability for me to use the logs, or causes to logs to "appear" out of sync to lnav if I aggregate more than one file together.

To Reproduce

  1. Start a Spring Boot application from the initializer and use any default logging configuration.
  2. Run the application and write logs to a file.
  3. Observe that the default configuration includes a colon.
  4. Observe that lnav will not match on java_log, but will match on generic_log.

I'm thinking about some pre-processing to do a workaround in lnav but would think that this should be addressed at the root.

@tstack
Copy link
Owner

tstack commented Jan 14, 2025

  1. Start a Spring Boot application from the initializer and use any default logging configuration.
  2. Run the application and write logs to a file.

I really don't have time to do something like this, can you attach some logs please.

@MakotoTheKnight
Copy link
Author

Sure. I can provide a handful of logs that aren't super sensitive from my current project.

2025-01-14T16:10:41.762-07:00  INFO 93649 --- [Course Creation] [           main] [                                                 ] e.c.o.se.coursecreation.CourseCreation   : Starting CourseCreation using Java 17.0.10 with PID 93649
2025-01-14T16:10:41.764-07:00  INFO 93649 --- [Course Creation] [           main] [                                                 ] e.c.o.se.coursecreation.CourseCreation   : No active profile set, falling back to 1 default profile: "default"
2025-01-14T16:10:42.226-07:00  INFO 93649 --- [Course Creation] [cTaskExecutor-1] [                                                 ] com.tngtech.archunit.core.PluginLoader   : Detected Java version 17.0.10
2025-01-14T16:10:42.724-07:00  INFO 93649 --- [Course Creation] [           main] [                                                 ] o.s.b.w.embedded.tomcat.TomcatWebServer  : Tomcat initialized with port 8080 (http)
2025-01-14T16:10:42.728-07:00  INFO 93649 --- [Course Creation] [           main] [                                                 ] o.apache.catalina.core.StandardService   : Starting service [Tomcat]
2025-01-14T16:10:42.729-07:00  INFO 93649 --- [Course Creation] [           main] [                                                 ] o.apache.catalina.core.StandardEngine    : Starting Servlet engine: [Apache Tomcat/10.1.34]
2025-01-14T16:10:42.748-07:00  INFO 93649 --- [Course Creation] [           main] [                                                 ] o.a.c.c.C.[Tomcat].[localhost].[/]       : Initializing Spring embedded WebApplicationContext
2025-01-14T16:10:42.749-07:00  INFO 93649 --- [Course Creation] [           main] [                                                 ] w.s.c.ServletWebServerApplicationContext : Root WebApplicationContext: initialization completed in 971 ms
2025-01-14T16:10:43.178-07:00  INFO 93649 --- [Course Creation] [           main] [                                                 ] com.zaxxer.hikari.HikariDataSource       : HikariPool-1 - Starting...
2025-01-14T16:10:43.274-07:00  INFO 93649 --- [Course Creation] [           main] [                                                 ] com.zaxxer.hikari.pool.HikariPool        : HikariPool-1 - Added connection conn0: url=jdbc:h2:file:./modulith-test-db user=SA
2025-01-14T16:10:43.275-07:00  INFO 93649 --- [Course Creation] [           main] [                                                 ] com.zaxxer.hikari.HikariDataSource       : HikariPool-1 - Start completed.
2025-01-14T16:10:43.278-07:00  INFO 93649 --- [Course Creation] [           main] [                                                 ] o.s.b.a.h2.H2ConsoleAutoConfiguration    : H2 console available at '/h2-console'. Database available at 'jdbc:h2:file:./modulith-test-db'
2025-01-14T16:10:43.314-07:00  INFO 93649 --- [Course Creation] [           main] [                                                 ] o.s.b.a.e.web.EndpointLinksResolver      : Exposing 0 endpoints beneath base path '/actuator'
2025-01-14T16:10:43.348-07:00  INFO 93649 --- [Course Creation] [           main] [                                                 ] o.s.b.w.embedded.tomcat.TomcatWebServer  : Tomcat started on port 8080 (http) with context path '/'
2025-01-14T16:10:43.357-07:00  INFO 93649 --- [Course Creation] [           main] [                                                 ] o.s.s.config.TaskSchedulerRouter         : No TaskScheduler/ScheduledExecutorService bean found for scheduled processing
2025-01-14T16:10:43.359-07:00  INFO 93649 --- [Course Creation] [           main] [                                                 ] e.c.o.se.coursecreation.CourseCreation   : Started CourseCreation in 1.748 seconds (process running for 2.253)
2025-01-14T16:10:43.907-07:00  INFO 93649 --- [Course Creation] [on(3)-10.0.0.31] [                                                 ] o.a.c.c.C.[Tomcat].[localhost].[/]       : Initializing Spring DispatcherServlet 'dispatcherServlet'
2025-01-14T16:10:43.908-07:00  INFO 93649 --- [Course Creation] [on(3)-10.0.0.31] [                                                 ] o.s.web.servlet.DispatcherServlet        : Initializing Servlet 'dispatcherServlet'
2025-01-14T16:10:43.908-07:00  INFO 93649 --- [Course Creation] [on(3)-10.0.0.31] [                                                 ] o.s.web.servlet.DispatcherServlet        : Completed initialization in 0 ms
2025-01-14T16:10:44.175-07:00  INFO 93649 --- [Course Creation] [         task-1] [6618a4de8f36bf71c60b2982181e6d36-c66827472589372a] e.c.o.s.c.c.ConsumerService              : Received new unassociated section CanvasSection(course=ABCD1234, term=2251, sectionSourcedId=123456-01-2251-B-100-12345, title=A test title, section=null, sessionCode=null, activityType=LEC, classNumber=null, academicOrgCode=null, startDate=2025-01-11, endDate=2025-05-09, enrollments=[CanvasEnrollment(emplid=123456, role=PI, sectionSourcedId=123456-01-2251-B-100-12345)])
2025-01-14T16:10:46.274-07:00  INFO 93649 --- [Course Creation] [         task-3] [4ed1870ce57ec39c66e7263fc6409852-327c2f911afe211e] e.c.o.s.c.c.ConsumerService              : Received new associated sections AssociatedCanvasSections(sections=[CanvasSection(course=ABCD1234, term=2251, sectionSourcedId=123456-01-2251-B-100-12345, title=A test title, section=null, sessionCode=null, activityType=LEC, classNumber=null, academicOrgCode=null, startDate=2025-01-11, endDate=2025-05-09, enrollments=[CanvasEnrollment(emplid=123456, role=PI, sectionSourcedId=123456-01-2251-B-100-12345)]), CanvasSection(course=ABCD1234, term=2251, sectionSourcedId=123456-01-2251-B-100-12345, title=A test title, section=null, sessionCode=null, activityType=LEC, classNumber=null, academicOrgCode=null, startDate=2025-01-11, endDate=2025-05-09, enrollments=[CanvasEnrollment(emplid=8945185, role=SI, sectionSourcedId=123456-01-2251-B-100-12345)])])
2025-01-14T16:10:46.451-07:00  INFO 93649 --- [Course Creation] [         task-4] [aea62c093e97acd94d21491593d9d04b-ff223aef48662139] e.c.o.s.c.c.ConsumerService              : Received new unassociated section CanvasSection(course=ABCD1234, term=2251, sectionSourcedId=123456-01-2251-B-100-12345, title=A test title, section=null, sessionCode=null, activityType=LEC, classNumber=null, academicOrgCode=null, startDate=2025-01-11, endDate=2025-05-09, enrollments=[CanvasEnrollment(emplid=123456, role=PI, sectionSourcedId=123456-01-2251-B-100-12345)])
2025-01-14T16:10:47.921-07:00  WARN 93649 --- [Course Creation] [_WorkerThread-1] [                                                 ] i.o.exporter.zipkin.ZipkinSpanExporter   : Failed to export spans

@tstack
Copy link
Owner

tstack commented Jan 14, 2025

Which of these patterns were you expecting to match these logs?

"regex": {
"jvm": {
"pattern": "^(?<level>\\w+)\\s+\\|\\s+jvm (?<jvm_no>\\d+)\\s+\\|\\s(?<timestamp>\\d{4}/\\d{2}/\\d{2} \\d{2}:\\d{2}:\\d{2})\\s+\\| (?<timestamp_f>\\d{4}-\\d{2}-\\d{2} \\d{2}:\\d{2}:\\d{2},\\d{3}) \\[(?<function>\\w+-\\d+)\\]\\s+(?<debug_level>\\w+)\\s+(?<class>[\\w.]+)\\s+-\\s+(?<body>.*)"
},
"dump": {
"pattern": "^(?<level>\\w+)\\s+\\|\\s+jvm (?<jvm_no>\\d+)\\s+\\|\\s(?<timestamp>\\d{4}/\\d{2}/\\d{2} \\d{2}:\\d{2}:\\d{2})\\s+\\| JVMDUMP\\w+\\s(?<body>.*)$"
},
"tasko": {
"pattern": "^(?<timestamp>\\d{4}-\\d{2}-\\d{2}( |T)\\d{2}:\\d{2}:\\d{2}(,|\\.)\\d{3}(?:Z|[-+]\\d{2}:?\\d{2})?)\\s+\\[(?<thread>[\\w\\-\\.]+)\\]\\s+(?<level>ERROR|WARN|INFO|DEBUG|TRACE)\\s+(?<class>[\\w.]+)\\s+(-\\s+)?(?<body>.*)$"
},
"prefix-brackets": {
"pattern": "^\\[(?<timestamp>\\d{4}-\\d{2}-\\d{2}( |T)\\d{2}:\\d{2}:\\d{2}(,|\\.)\\d{3}(?:Z|[-+]\\d{2}:?\\d{2})?)\\s+(?<thread>[\\w\\-\\.]+)\\s+(?<level>ERROR|WARN|INFO|DEBUG|TRACE)\\s+(?<class>[\\w.]+)(?:\\s+opId=(?<opid>[^\\]]*))?\\]\\s*(-\\s+)?(?<body>.*)$"
},
"in-brackets": {
"pattern": "^(?<timestamp>\\d{4}-\\d{2}-\\d{2}( |T)\\d{2}:\\d{2}:\\d{2}(,|\\.)\\d{3}(?:Z|[-+]\\d{2}:?\\d{2})?)\\s+\\[(?<thread>[\\w\\-\\.]+)(?:\\s+(?:\\[\\]|null))?\\s+(?<level>ERROR|WARN|INFO|DEBUG|TRACE)\\s+(?<class>[\\w.]+)(?:\\s+opId=(?<opid>[^\\]]*))?\\]\\s*(-\\s+)?(?<body>.*)$"
},
"nobrackets": {
"pattern": "^(?<timestamp>\\d{4}-\\d{2}-\\d{2}( |T)\\d{2}:\\d{2}:\\d{2}(,|\\.)\\d{3}(?:Z|[-+]\\d{2}:?\\d{2})?)\\s+(?<thread>[\\w\\-\\.]+) (?:operationID=(?<opid>\\S+))?\\s+(?<level>ERROR|WARN|INFO|DEBUG|TRACE)\\s+(?<class>[\\w.]+)\\s+(-\\s+)?(?<body>.*)$"
},
"vmw1": {
"pattern": "^(?<timestamp>\\d{4}-\\d{2}-\\d{2}( |T)\\d{2}:\\d{2}:\\d{2}(,|\\.)\\d{3}(?:Z|[-+]\\d{2}:?\\d{2})?)\\s*\\|\\s*(?<level>ERROR|WARN|INFO|DEBUG|TRACE)\\s*\\|\\s*(?<thread>[^\\|]+)\\s*\\|\\s*(?<srcfile>[^\\|]+)\\s*\\|\\s*(?<srcline>\\d+)\\s*\\|\\s*(?<body>.*)$"
},
"vmw2": {
"pattern": "^\\[(?<timestamp>\\d{4}-\\d{2}-\\d{2}( |T)\\d{2}:\\d{2}:\\d{2}(,|\\.)\\d{3}(?:Z|[-+]\\d{2}:?\\d{2})?)\\]\\s*(?<level>ERROR|WARN|INFO|DEBUG|TRACE)\\s*\\d+\\[(?<thread>[^\\]]+)\\]\\s+-\\s+(?<class>[^\\(]+)\\.(?<method>\\w+)\\((?<srcfile>[^:]+):(?<srcline>\\d+)\\)\\s+-\\s+(?<body>.*)$"
},
"vmw3": {
"pattern": "^(?<timestamp>\\d{4}-\\d{2}-\\d{2}( |T)\\d{2}:\\d{2}:\\d{2}(,|\\.)\\d{3}(?:Z|[-+]\\d{2}:?\\d{2})?)\\s*\\|\\s*(?<level>ERROR|WARN|INFO|DEBUG|TRACE)\\s*\\|\\s*(?<thread>[^\\|]+)\\s*\\|\\s*(?<class>[a-zA-Z_\\.][^\\|]+)\\s*\\|\\s+(?!\\d+\\s*\\|)(?<body>.*)$"
},
"vmw-sso": {
"pattern": "^(?<timestamp>\\d{4}-\\d{2}-\\d{2}( |T)\\d{2}:\\d{2}:\\d{2}(,|\\.)\\d{3}(?:Z|[-+]\\d{2}:?\\d{2})?)\\s+(?<level>ERROR|WARN|INFO|DEBUG|TRACE)\\s+[\\w\\-]+\\[\\d+:(?<thread>[^\\]]+)\\]\\s+\\[CorId=(?<corid>[^\\s\\]]*)(?:\\s+OpId=(?<opid>[^\\]]*))?\\]\\s+\\[(?<class>[^\\]]+)\\]\\s+(?<body>.*)$"
},
"vmw-sps": {
"pattern": "^(?<timestamp>\\d{4}-\\d{2}-\\d{2}( |T)\\d{2}:\\d{2}:\\d{2}(,|\\.)\\d{3}(?:Z|[-+]\\d{2}:?\\d{2})?)\\s+\\[(?<thread>[^\\]]+)\\]\\s+(?<level>ERROR|WARN|INFO|DEBUG|TRACE)\\s+opId=(?<opid>\\S*)\\s+(?<class>\\S+)\\s+-\\s+(?<body>.*)$"
},
"level-thread-class": {
"pattern": "^(?<timestamp>\\d{4}-\\d{2}-\\d{2}[ T]\\d{2}:\\d{2}:\\d{2}[,\\.]\\d{1,3}) (?<level>\\w+)\\s+\\[(?<thread>[^\\]]+)\\] (?<class>[^:]+): (?<body>.*)"
},
"level-class-src": {
"pattern": "^(?<timestamp>\\d{4}-\\d{2}-\\d{2}[ T]\\d{2}:\\d{2}:\\d{2}[,\\.]\\d{1,3}) (?<level>\\w+)\\s+(?<class>[\\.\\w]+) \\((?<srcfile>[^:]+):(?<function>[^(]+)\\((?<srcline>\\d+)\\)\\) - (?<body>.*)"
},
"repeated": {
"pattern": "^(?<timestamp>\\d{4}-\\d{2}-\\d{2}[ T]\\d{2}:\\d{2}:\\d{2}[,\\.]\\d{1,3}) \\^\\^\\^ \\d+ similar messages? omitted \\^\\^\\^$"
}
},

If I add the first line of your logs as a sample, I get this output:

 = note: the following shows how each pattern matched this sample:
            2025-01-14T15:10:41.762-08:00  INFO 93649 --- [Course Creation] [           main] [                                                 ] e.c.o.se.coursecreation.CourseCreation   : Starting CourseCreation using Java 17.0.10 with PID 93649
                ^ dump matched up to here
                                           ^ in-brackets matched up to here
                ^ jvm matched up to here
                                   ^ level-class-src matched up to here
                                   ^ level-thread-class matched up to here
                                                ^ nobrackets matched up to here
            ^ prefix-brackets matched up to here
                                   ^ repeated matched up to here
                                           ^ tasko matched up to here
                                           ^ vmw-sps matched up to here
                                                     ^ vmw-sso matched up to here
                                           ^ vmw1 matched up to here
            ^ vmw2 matched up to here
                                           ^ vmw3 matched up to here
 = note: dump               = “^(?<level>\w+)\s+\|\s+jvm (?<jvm_no>\d+)\s+\|\s(?<timestamp>\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2})\s+\| JVMDUMP\w+\s(?<body>.*)$”
         in-brackets        = “^(?<timestamp>\d{4}-\d{2}-\d{2}( |T)\d{2}:\d{2}:\d{2}(,|\.)\d{3}(?:Z|[-+]\d{2}:?\d{2})?)\s+\[(?<thread>[\w\-\.]+)(?:\s+(?:\[\]|null))?\s+(?<level>ERROR|WARN|INFO|DEBUG|TRACE)\s+(?<class>[\w.]+)(?:\s+opId=(?<opid>[^\]]*))?\]\s*(-\s+)?(?<body>.*)$”
         jvm                = “^(?<level>\w+)\s+\|\s+jvm (?<jvm_no>\d+)\s+\|\s(?<timestamp>\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2})\s+\| (?<timestamp_f>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) \[(?<function>\w+-\d+)\]\s+(?<debug_level>\w+)\s+(?<class>[\w.]+)\s+-\s+(?<body>.*)”
         level-class-src    = “^(?<timestamp>\d{4}-\d{2}-\d{2}[ T]\d{2}:\d{2}:\d{2}[,\.]\d{1,3}) (?<level>\w+)\s+(?<class>[\.\w]+) \((?<srcfile>[^:]+):(?<function>[^(]+)\((?<srcline>\d+)\)\) - (?<body>.*)”
         level-thread-class = “^(?<timestamp>\d{4}-\d{2}-\d{2}[ T]\d{2}:\d{2}:\d{2}[,\.]\d{1,3}) (?<level>\w+)\s+\[(?<thread>[^\]]+)\] (?<class>[^:]+): (?<body>.*)”
         nobrackets         = “^(?<timestamp>\d{4}-\d{2}-\d{2}( |T)\d{2}:\d{2}:\d{2}(,|\.)\d{3}(?:Z|[-+]\d{2}:?\d{2})?)\s+(?<thread>[\w\-\.]+) (?:operationID=(?<opid>\S+))?\s+(?<level>ERROR|WARN|INFO|DEBUG|TRACE)\s+(?<class>[\w.]+)\s+(-\s+)?(?<body>.*)$”
         prefix-brackets    = “^\[(?<timestamp>\d{4}-\d{2}-\d{2}( |T)\d{2}:\d{2}:\d{2}(,|\.)\d{3}(?:Z|[-+]\d{2}:?\d{2})?)\s+(?<thread>[\w\-\.]+)\s+(?<level>ERROR|WARN|INFO|DEBUG|TRACE)\s+(?<class>[\w.]+)(?:\s+opId=(?<opid>[^\]]*))?\]\s*(-\s+)?(?<body>.*)$”
         repeated           = “^(?<timestamp>\d{4}-\d{2}-\d{2}[ T]\d{2}:\d{2}:\d{2}[,\.]\d{1,3}) \^\^\^ \d+ similar messages? omitted \^\^\^$”
         tasko              = “^(?<timestamp>\d{4}-\d{2}-\d{2}( |T)\d{2}:\d{2}:\d{2}(,|\.)\d{3}(?:Z|[-+]\d{2}:?\d{2})?)\s+\[(?<thread>[\w\-\.]+)\]\s+(?<level>ERROR|WARN|INFO|DEBUG|TRACE)\s+(?<class>[\w.]+)\s+(-\s+)?(?<body>.*)$”
         vmw-sps            = “^(?<timestamp>\d{4}-\d{2}-\d{2}( |T)\d{2}:\d{2}:\d{2}(,|\.)\d{3}(?:Z|[-+]\d{2}:?\d{2})?)\s+\[(?<thread>[^\]]+)\]\s+(?<level>ERROR|WARN|INFO|DEBUG|TRACE)\s+opId=(?<opid>\S*)\s+(?<class>\S+)\s+-\s+(?<body>.*)$”
         vmw-sso            = “^(?<timestamp>\d{4}-\d{2}-\d{2}( |T)\d{2}:\d{2}:\d{2}(,|\.)\d{3}(?:Z|[-+]\d{2}:?\d{2})?)\s+(?<level>ERROR|WARN|INFO|DEBUG|TRACE)\s+[\w\-]+\[\d+:(?<thread>[^\]]+)\]\s+\[CorId=(?<corid>[^\s\]]*)(?:\s+OpId=(?<opid>[^\]]*))?\]\s+\[(?<class>[^\]]+)\]\s+(?<body>.*)$”
         vmw1               = “^(?<timestamp>\d{4}-\d{2}-\d{2}( |T)\d{2}:\d{2}:\d{2}(,|\.)\d{3}(?:Z|[-+]\d{2}:?\d{2})?)\s*\|\s*(?<level>ERROR|WARN|INFO|DEBUG|TRACE)\s*\|\s*(?<thread>[^\|]+)\s*\|\s*(?<srcfile>[^\|]+)\s*\|\s*(?<srcline>\d+)\s*\|\s*(?<body>.*)$”
         vmw2               = “^\[(?<timestamp>\d{4}-\d{2}-\d{2}( |T)\d{2}:\d{2}:\d{2}(,|\.)\d{3}(?:Z|[-+]\d{2}:?\d{2})?)\]\s*(?<level>ERROR|WARN|INFO|DEBUG|TRACE)\s*\d+\[(?<thread>[^\]]+)\]\s+-\s+(?<class>[^\(]+)\.(?<method>\w+)\((?<srcfile>[^:]+):(?<srcline>\d+)\)\s+-\s+(?<body>.*)$”
         vmw3               = “^(?<timestamp>\d{4}-\d{2}-\d{2}( |T)\d{2}:\d{2}:\d{2}(,|\.)\d{3}(?:Z|[-+]\d{2}:?\d{2})?)\s*\|\s*(?<level>ERROR|WARN|INFO|DEBUG|TRACE)\s*\|\s*(?<thread>[^\|]+)\s*\|\s*(?<class>[a-zA-Z_\.][^\|]+)\s*\|\s+(?!\d+\s*\|)(?<body>.*)$”

It seems like a new regex needs to be added. Note that you can patch existing log formats. Would you like some help writing the regex?

@MakotoTheKnight
Copy link
Author

I would expect the jvm pattern to match. I'm comfortable with getting some help on the regex as well; I know that there's one additional box there (for some tracing) that isn't 100% conventional for most Java applications, but I think I was mostly lost on working around the timestamp.

tstack added a commit that referenced this issue Jan 15, 2025
@tstack
Copy link
Owner

tstack commented Jan 15, 2025

I've pushed a change that adds the following pattern that matches your logs:

"spring1": {
"pattern": "(?<timestamp>\\d{4}-\\d{2}-\\d{2}T\\d{2}:\\d{2}:\\d{2}\\.\\d{3}(?:Z|(?:-|\\+)\\d{2}:?\\d{2}))\\s+(?<level>ERROR|WARN|INFO|DEBUG|TRACE)\\s+(?<pid>\\d+)\\s+---\\s+\\[[^\\]]+\\]\\s+\\[\\s*(?<thread>[^\\]]+)\\]\\s+\\[\\s*(?<opid>[^\\]]+)\\s*\\]\\s+(?<class>[^: ]+)\\s*:\\s+(?<body>.*)"
}

However, I'm not totally sure about all the fields. For the 3rd column with the number 93649, I thought it might be the process ID, so the field is named pid. I'm not sure if the three dashes near the beginning, I'm matching them exactly. Not sure if it actually means a null value or something. I didn't capture the 4th column ([Course Creation]), since I am not sure what it means.

If you can try it out and clue me in, that would be helpful. Thanks!

@MakotoTheKnight
Copy link
Author

I've installed the new log format using lnav -i and I'm not seeing a match on some of my logs. I think this could be an issue with how I've got things set up, and I have more insight into the regex that I may be able to come up with more examples and be able to build more of the regex out too. Let me take this on and over the course of a (hopefully) snowy weekend, I can make some more headway on this and report back.

@vitalca
Copy link

vitalca commented Jan 26, 2025

Here's the description of the format, @tstack.

Please note, that some of the fields are optional. They're not shown by default.

Spring Boot is the de facto standard for all enterprise Java development nowadays.

https://docs.spring.io/spring-boot/reference/features/logging.html#features.logging.log-format

@MakotoTheKnight
Copy link
Author

It's taken a while, and I do apologize for my absence on this, but I have something that works in my environment.

The basic thing I did was ignore the extra fields...since they're optional, not guaranteed to be there, and finicky to deal with in Regex altogether. Ultimately this may come back to bite me but for what I want to use lnav for now, this works with some tests that also help reinforce that.

Thanks for the kick-start with the timestamp piece!

"spring1": {
    "pattern": "(?<timestamp>\\d{4}-\\d{2}-\\d{2}T\\d{2}:\\d{2}:\\d{2}\\.\\d{3}?(?:Z|(?:-|\\+)\\d{2}:?\\d{2}))\\s+(?<level>ERROR|WARN|INFO|DEBUG|TRACE)\\s+(?<pid>\\d+)\\s+?---\\s+\\[(?<thread>.*?)\\].*?(?<class>[^: ]+\\s+:)\\s+(?<body>.*)"
}

Tests here:

{
                "line": "2025-01-14T16:10:44.175-07:00  INFO 93649 --- [Course Creation] [         task-1] [6618a4de8f36bf71c60b2982181e6d36-c66827472589372a] e.c.o.s.c.c.ConsumerService              : Received new unassociated section CanvasSection(course=ABCD1234, term=2251, sectionSourcedId=123456-01-2251-B-100-12345, title=A test title, section=null, sessionCode=null, activityType=LEC, classNumber=null, academicOrgCode=null, startDate=2025-01-11, endDate=2025-05-09, enrollments=[CanvasEnrollment(emplid=123456, role=PI, sectionSourcedId=123456-01-2251-B-100-12345)])"
            },
            {
                "line": "2025-01-23T11:42:36.681Z  INFO 125873 --- [myapp] [           main] o.s.b.d.f.logexample.MyApplication       : Starting MyApplication using Java 17.0.14 with PID 125873 (/opt/apps/myapp.jar started by myuser in /opt/apps/)"
            },
            {
                "line": "2025-01-23T11:42:36.681Z  INFO 125873 --- [           main] o.s.b.d.f.logexample.MyApplication       : Starting MyApplication using Java 17.0.14 with PID 125873 (/opt/apps/myapp.jar started by myuser in /opt/apps/)"
            }

@vitalca
Copy link

vitalca commented Jan 30, 2025

Hi @MakotoTheKnight, I am afraid your pattern is not complete. Please follow the link in my previous message and try to open in lnav the example log from the official Spring Boot documentation.

The highlighted fields in the screenshot below are optional.

  • In my experience (15 years of working with Spring), the Application name is more likely to be set than not. In the official example log you can see it set. Better add an optional regexp group for it.

  • Regarding the Application group, it was recently introduced in the latest major release Spring Boot 3.4. I haven't seen anyone using it yet, especially in production. I don't think it makes sense for the plain text log files we're talking about, but it could be quite useful for structured logging (JSON), which has also been introduced recently and is intended for cloud-native centralized logging. So I would just skip it for now.

  • The last one, Correlation ID, was added in Spring Boot 3.2.x. It only appears when Micrometer Tracing is explicitly enabled. It's rather a special case as it instruments the bytecode, and isn't supposed to be on by default in production. I would ignore this field.


Image

@MakotoTheKnight
Copy link
Author

@vitalca , I'm not clear what you mean by the regex not being "complete", since two of the three examples I place in the tests block in my response are directly referenced from that example page.

My goal is not to comprehensively capture every group with this initial pass, it's to ensure that Spring Boot's logs are not captured as generic_log, which they would be without this.

In spite of the fact that the application name is generally provided, it is not mandatory, and getting an optional regex for that to work reliably is a lot more finicky than it should be. The reason for that is how the groups are defined: the application name, application group, thread name and correlation ID are all captured in square brackets, and if one optional field is omitted but others are not, then that could mean that we're mis-capturing things. How would we be able to tell the difference between logs that present just a thread name and a correlation ID? (I have a use case like that so I tend to get burned by it)

If you want to add on to the regex I've shared that can satisfy those additional requirements, then I'd encourage you to do so. For now though, I'd conjecture that this gets us just enough to where we need to start to then build on/improve on things.

Or, maybe some more advanced work with scripts and reading the logs in after they land in SQLite might be more appropriate.

@vitalca
Copy link

vitalca commented Jan 30, 2025

@MakotoTheKnight,

I'm not clear what you mean by the regex not being "complete"

I mean that I took your regexp and it does not match the example log from the documentation.

Image

  1. myapp is the Application name, not thread;
  2. \\[(?<thread>.*?)\\] does not ignore the leading spaces. In real life a thread name looks like [ main];
  3. It should not take the colon. And so on...

Of course, we can't cover all possible combinations, but at least the most common cases for the majority of lnav users.

If you want to add on to the regex I've shared that can satisfy those additional requirements, then I'd encourage you to do so

There you go:

"spring1": {
    "pattern": "^(?<timestamp>[0-9-]{7,10}T[0-9.:]{2,12}(Z|[+-][0-9:]{2,5}))\\s+(?<level>[A-Z]{4,5})\\s+(?<pid>\\d+)\\s+---(?:\\s+\\[\\s*(?<app>[^\\]]+)\\])?\\s+\\[\\s*(?<thread>[^\\]]+)]\\s+(?<class>\\S+)\\s+:\\s*(?<body>.*)$"
}

Image

@vitalca
Copy link

vitalca commented Jan 30, 2025

How would we be able to tell the difference between logs that present just a thread name and a correlation ID? (I have a use case like that so I tend to get burned by it)

I think you can just add custom rules to your local config for your own cases.
No need to overcomplicate things and share it with other users, as too many rules will make lnav slow to start.

@vitalca
Copy link

vitalca commented Jan 30, 2025

As to this pattern in my solution,

(?<timestamp>[0-9-]{7,10}T[0-9.:]{2,12}(Z|[+-][0-9:]{2,5}))

According to Wiki,

ISO 8601 allows the following date formats - [0-9-]{7,10}:

YYYY-MM-DD
YYYYMMDD
YYYY-MM

and the following time formats - [0-9.:]{2,12}:

hh:mm:ss.sss
hh:mm:ss
hh:mm.mmm
hh:mm
hh.hhh

hhmmss.sss
hhmmss
hhmm.mmm
hhmm
hh

and the following zone designators - Z|[+-][0-9:]{2,5}:

Z
±hh:mm
±hhmm
±hh

@tstack
Copy link
Owner

tstack commented Jan 30, 2025

Please note, that some of the fields are optional. They're not shown by default.

Ah, yes, I've made the correlation ID optional in this commit: 4d624f5

I'd need some example logs where the app group is set.

Sorry for not getting back, I got obsessed with some performance improvements.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

3 participants