{"value":"While working with an AWS service team to diagnose unexpected Garbage Collection (GC) pauses, the [Amazon Corretto](https://aws.amazon.com/corretto) team discovered that applications were being throttled by I/O blocking while writing logs to disk. As a result, we decided to implement and contribute async-logging([JDK-8229517](https://bugs.openjdk.java.net/browse/JDK-8229517))) to OpenJDK 17. In this post, we’re going to explain how you can use the -Xlog:async switch and the Unified Logging (UL) framework to avoid extended GC pauses due to I/O. Subsequent to our changes, the service improved their outlier p99.99 GC pause latency from 3s to 1s.\n\nUL is a logging framework for the Java Virtual Machine (JVM) introduced in OpenJDK 9. Many AWS services detect anomalies and subsequently take action based upon logging output from running JVMs. Logs also provide abundant details about Garbage Collection (GC) activities, helping developers to pinpoint the cause of long pauses and tune GC performance based on them.\n\nUL is flexible in that you can change its configuration at runtime. OpenJDK users can reduce their logging overhead and save disk space with a default, terse configuration, but can dynamically increase logging output when required. For example, a developer can map an alarm trigger to code that will ask the JVM to increase logging context and verbosity when the monitored JVM crosses a given threshold. Some AWS services use this pattern to reduce default operational loads.\n\nYou can access these features using any OpenJDK distro, including Amazon’s [Corretto 17](https://github.com/corretto/corretto-17). To understand how this works, let’s dive into details.\n\n\n### **JVM Unified Logging**\n\n\nUnified Logging (UL) is a dynamic, tag-based logging framework for the HotSpot JVM. UL configuration is a mini language unto itself. OpenJDK users will find a formal syntax in [JEP-158](https://openjdk.java.net/jeps/158) and a help message using```-Xlog:help```, but we’ll provide the basic concepts here.\n\nAs of OpenJDK 17, developers can access hundreds of log **tags** that they can use to identify logging data output. Some examples of the available tag names are class, compilation, gc, metadata, and stats. These represent the object of the logging information. You can find a full list of these by running the help command noted above.\n\nNext, you can group these tags, so that you can adjust the logging levels associated with multiple logging tags simultaneously. We call this grouping a **tagset**. A developer will likely see the value in being able to turn up the logging level for a number of related tags.\n\nThen, we have the logging instrumentation. Let’s call these “**log sites**”. Each log site is logically associated with one tagset. Finally, we have a **log selector**, which is a query string you can use to match from zero to multiple tagsets along with their verbosity levels. It’s the log selector that you can use to filter the logging output.\n\nHere an example, the log message are obtained from ```java -Xlog:'gc*=info:stdout'```\n\n```[0.030s][info][gc,init] CardTable entry size: 512\n[0.050s][info][gc ] Using G1\n(redacted)\n[0.068s][info][gc,init] Concurrent Workers: 7\n...\n[0.068s][info][gc,metaspace] Narrow klass base: 0x0000000800000000, Narrow klass shift: 0, Narrow klass range: 0x100000000\n...\n\n```\nThe first log message comes from the log site below; its tagset is (gc,init). The code will only be executed when the log is selected.\n\n```\nlog_info_p(gc, init)(\"CardTable entry size: \" UINT32_FORMAT, _card_size);\n\n```\n\nThe second log message has a logset ‘gc’.\n\n```\nlog_info(gc)(\"Using %s\", _collectedHeap→name());\n\n```\n‘gc=info’ only selects logsites with the tagset ‘gc’. By contrast, the wildcard is used to match all tagsets. ```'gc*'```will select tagset ‘gc’, ‘gc+init’ and all others with prefix gc. The log selector```gc*=info'```in the example breaks down into 2 parts: query things ‘gc*’ and verbosity level ‘info’.\n\nThis blog post is not a thorough tutorial of UL. Readers who are not familiar with it can read [JEP-158 ](https://openjdk.java.net/jeps/158)or this [tutorial](https://blog.arkey.fr/2020/07/28/embracing-jvm-unified-logging-jep-158-jep-271/). Since OpenJDK 9, there’s been no separate GC log: it was integrated into UL. If you’ve used OpenJDK 8, you may have added ```XX:+PrintGC```to your command line. Since OpenJDK 9, you’ve been able to select GC logs using the “gc” tag in combination with others. More details can be found in [JEP-271](https://blog.arkey.fr/2020/07/28/embracing-jvm-unified-logging-jep-158-jep-271/).\n\n```\n-Xlog:\n gc*=debug, \n :file=/mygclogs/%t-gc.log:uptime,tags,level:filecount=10,filesize=10M\n```\n\n**Asynchronous logging**\n\n\nYou can direct output UL to a file. Even though file I/O is buffered, log writing still cannot be guaranteed to be non-blocking because the ‘write()’ syscall depends on the implementation of a specific filesystem and the underlying physical device. E.g., Linux software RAID needs to synchronize multiple writes to different disk partitions, and a network-backed Virtual File System (VFS) may block while writing due to a slow network connection.\n\nThe HotSpot JVM has a global synchronization mechanism called a ‘**safepoint**’. At a safepoint, HotSpot forces all Java threads to pause. Some GC-related VM operations require a safepoint to ensure that Java threads do not mutate the Java heap while GC is running. If the UL writes to files during a safepoint and the writes happen to be delayed or blocked, then the safepoint pause time for such VM operations will be prolonged. These low-level incidents will increase application response time.\n\nOne pattern that should draw your attention to potential I/O problems is an unusual CPU time distribution. The ```gc,gc+cpu```tagset reveals such information. In the example below, we ran the Java application with 7 concurrent GC threads. We’d expect that the “User” time associated with the thread activity would be higher than the “Real” time. However, in the log we see that the “User” time is actually smaller, which makes us suspicious. One possible explanation for this could be that some GC threads were blocked by disk I/O. Perhaps they were not in the wait queue so they would not consume CPU time.\n```\n[gc ] GC(28) Pause Young (Mixed) (G1 Evacuation Pause) 8200M->7105M(10240M) 770.84ms\n[gc,cpu ] GC(28) User=0.36s Sys=0.02s Real=0.77s\n```\n\nTo address the issue, the AWS Corretto Team developed a new feature called “Asynchronous Logging” (asynclog), and have added it to OpenJDK 17. When asynclog is enabled, logsites enqueue logging messages to an in-memory buffer and a dedicated, concurrent thread asynchronously flushes them to the specified output file. Log writes to the in-memory buffer are guaranteed to be non-blocking. By default, the intermediate buffer size is bounded to 2 MB, but configurable with ```-XX:AsyncLogBufferSize```. If the buffer fills up before the asynchronous log writer thread can dump its contents to file, new log messages will be discarded. You can use the new option ```-Xlog:async``` to tell UL to run in asynclog mode.\n\nThe following two experiments show why asynclog matters for Java applications. The first shows how it alleviates the logging penalty by leveraging concurrency. The second shows how asynclog improves high-percentile latency.\n\n**Performance impact on Java HelloWorld with full UL**\n\nIn the first experiment, we enable ```all=trace``` to produce exhaustive UL logs. This tagset covers all parts and the entire JVM lifecycle. For a typical “HelloWorld” program running on OpenJDK 17, when we enabled this option, we observed 16,219 messages for synchronous UL and 16,225 log messages for ```-Xlog:async```. The extra log messages are from the asynclog subsystem itself. Please note that ```filecount=0``` is a UL output option which prevents log file rotation. Here is the full Java command.\n\n```\njava -Xlog:async -Xlog:all=trace:file=all.log::filecount=0 HelloWorld\n\n```\n\nAs you can see in the graph, in our experiment, asynchronous logging reduced the Real CPU time by 34.5% from 108 to 70.7ms. At the same time, we saw overall CPU utilization increase from 102.7% to 171.6%. The runtime gain is achieved by offloading the I/O task of log writing into a separate, concurrent thread. While this considerably reduces the absolute running time of the application it also increases overall CPU consumption due to overhead introduced by the additional thread.\n\n![image.png](https://dev-media.amazoncloud.cn/660866bd4aa14c11864bfc06d058c0a9_image.png)\n\n\n**Impact on high-percentile GC pauses**\n\nIn a second experiment we ran the benchmark [heapothesys/HyperAlloc](https://github.com/corretto/heapothesys/tree/master/HyperAlloc) on a c5d.2xlarge instance which was backed by Hard Disk Drives (HDD), which are far slower than SSD. To make the latency more obvious, we used a background script to ensure disk utilization was close to 100%. We selected all GC tagsets with ‘debug’ verbosity. Our test program created four threads with an allocation rate of 1024M/s for 180 seconds and created a ```gc.log``` output file of about 4MB.\n\n```\njava -Xlog:async -Xlog:'gc*=debug:file=gc.log::filecount=0' -javaagent:jHiccup.jar='-a -d 10000 -i 1000 -l asynclog.log' -jar HyperAlloc-1.0.jar -d 180\n```\n\nThe data show how G1 GC latency is affected by disk I/O at p99.9 when GC logs are written synchronously. Some latency outliers (red spikes), aka [Hiccups](Hiccups), make the high-percentile GC pause time skyrocket. Asynclog is effective at curbing the impact of disk I/O, so even at p99.999 the latency remains below 15ms.\n\n![image.png](https://dev-media.amazoncloud.cn/f4b077d15ef24f2e95a410f78a7a9991_image.png)\n\n![image.png](https://dev-media.amazoncloud.cn/62e8352bdfdc47649648031e80746318_image.png)\n\n#### **Dynamic Configuration**\n\nOpenJDK users can change UL configuration on the fly even if they do not set up UL arguments on the Java command line. ```jcmd``` is a standalone tool which sends diagnostic commands (dcmds) to HotSpot. HotSpot has a thread called AttachListener, which listens on a local socket and processes incoming dcmds. For UL, the ```VM.log``` dcmd can display the current UL settings and change them. The following table shows the arguments of VM.log.\n\n![image.png](https://dev-media.amazoncloud.cn/3f1b44ae779b4d81b9b997b59c13fb1c_image.png)\n\nAssuming you have a Java application started whose PID is 85254. You can use VM.log to add a new log output.\n\n```\n$jcmd 85254 VM.log output=gc.log what=gc=debug decorators=uptime,pid,tags,level\n85254:\nCommand executed successfully\n\n```\n```gc=debug``` instructs UL to select log entries with the tag ```gc``` and a verbosity that is equal or higher than ```debug```. The output goes to the file ```gc.log```. When this dcmd completes, you will start observing the GC log in the file. You can use the ```list``` sub-command to verify that the log output has been updated.\n\n```\n$jcmd 85254 VM.log list\nLog output configuration:\n#0: stdout all=warning,gc=debug uptime,pid,level,tags (reconfigured)\n#1: stderr all=off uptime,level,tags\n#2: file=gc.log all=off,gc=debug uptime,pid,level,tags filecount=0,filesize=20480K,async=false (reconfigured)\n```\nJava users can increase the log verbosity or expand the tagset if more information is needed. E.g.,\n\n```\n$jcmd 85254 VM.log output=#2 what='gc*=trace'\n```\n```output=#2``` refers to the prior output file ```gc.log```. ```gc*``` is a log selector with a wildcard. It matches all tagsets with ```gc```, which is broader than before. Meanwhile, the verbosity level has increased from ```debug``` to ```trace```. Here is a sample of logs after this adjustment.\n```\n\n[34.844s][debug][gc,heap ] GC(90) Heap after GC invocations=91 (full 0):\n[34.844s][debug][gc,heap ] GC(90) garbage-first heap total 10485760K, used 6873044K [0x0000000580000000, 0x0000000800000000)\n[34.844s][debug][gc,heap ] GC(90) region size 8192K, 14 young (114688K), 14 survivors (114688K)\n[34.844s][debug][gc,heap ] GC(90) Metaspace used 4685K, committed 4864K, reserved 1056768K\n[34.844s][debug][gc,heap ] GC(90) class space used 505K, committed 640K, reserved 1048576K\n[34.844s][info ][gc ] GC(90) Pause Young (Normal) (G1 Evacuation Pause) 7255M->6711M(10240M) 185.667ms\n[34.844s][info ][gc,cpu ] GC(90) User=0.19s Sys=0.25s Real=0.18s\n[34.844s][trace][gc,region ] G1HR ALLOC(EDEN) [0x00000007fe800000, 0x00000007fe800000, 0x00000007ff000000]\n[34.844s][trace][gc,task ] G1 Service Thread (Remembered Set Sampling Task) (schedule) @35.130s\n[34.844s][trace][gc,alloc ] Thread-0: Successfully scheduled collection returning 0x00000007fe800000\n[34.844s][debug][gc,task ] G1 Service Thread (Remembered Set Sampling Task) (run) 82.982ms (cpu: 82.979ms)\n[34.844s][trace][gc,tlab ] ThreadLocalAllocBuffer::compute_size(3) returns 524288\n[34.844s][debug][gc,task,start ] G1 Service Thread (Periodic GC Task) (run)\n[34.844s][trace][gc,tlab ] TLAB: fill thread: 0x00007fd24c016c00 [id: 38147] desired_size: 4096KB slow allocs: 0 refill waste: 65536B alloc: 0.99843 8179KB refills: 1 waste 0.0% gc: 0B slow: 0B\n```\nava developers can also control UL programmatically right from their application because the dcmd functionality has been exposed as a MXBean. First of all, Java applications need to enable JMX.\n\n```\n-Dcom.sun.management.jmxremote.port=9999 \\\n-Dcom.sun.management.jmxremote.authenticate=false \\\n-Dcom.sun.management.jmxremote.ssl=false\n\n```\nThe following procedure reconfigures output #2 via JMX. The effect is same as the prior jcmd command. Please note that authentication and SSL are ignored for simplicity. That is not a good practice, but this is a [demo](https://github.com/navyxliu/JavaGCworkload/blob/master/CrankUpGCLog.java).\n\n```\nimport javax.management.MBeanServerConnection;\nimport javax.management.ObjectName;\nimport javax.management.remote.*;\n\npublic class CrankUpGCLog {\n public static void main(String[] args) throws Exception {\n JMXServiceURL url =\n new JMXServiceURL(\"service:jmx:rmi:///jndi/rmi://:9999/jmxrmi\");\n JMXConnector jmxc = JMXConnectorFactory.connect(url, null);\n MBeanServerConnection mbsc = jmxc.getMBeanServerConnection();\n ObjectName mxbean = new ObjectName(\"com.sun.management:type=DiagnosticCommand\");\n```\n#### **Testing it yourself**\n\nIf you’d like to experiment yourself with Async logging, you can follow the instructions below. Please ensure you have [Corretto 17](https://aws.amazon.com/corretto/?filtered-posts.sort-by=item.additionalFields.createdDate&filtered-posts.sort-order=desc) or later installed, as the Asynchronous logging feature was added in Corretto 17.\n```\n\njava -version\n\nopenjdk version \"17.0.1\" 2021-10-19 LTS\nOpenJDK Runtime Environment Corretto-17.0.1.12.1 (build 17.0.1+12-LTS)\nOpenJDK 64-Bit Server VM Corretto-17.0.1.12.1 (build 17.0.1+12-LTS, mixed mode, sharing)\n```\n**Now write HelloWorld Java Application in HelloWorld.java**\n\n```\n$ javac HelloWorld.java \n```\n\n**Windows (In Powershell, create a time function to measure the elapsed time it takes to execute the application)**\n\n```\n#Create the following function in Windows Powershell\n#Don't do this for Linux. The 'time' function is already built in\nPS c:\\Users\\your_name>function time { $Command = \"$args\"; Measure-Command { Invoke-Expression $Command 2>&1 | out-default} }\n```\n**On MacOS, Linux, and Windows (Using Powershell)**\n\nPlease note that the output below was obtained via Windows Powershell.\n\n```\n# Test the Application without using Async Logging\n# You can run it a few times to get an average\n# NOTE: in Windows Powershell, pass in the parameters in single quotes \n# Please remove the single quotes for Linux implementations\n\n PS C:\\Users\\Administrator> time java '-Xlog:all=trace:file=hotspot.log.1:l,tg:filecount=0' HelloWorld\n Hello World!\n\n\nDays : 0\nHours : 0\nMinutes : 0\nSeconds : 0\nMilliseconds : 514\nTicks : 5149368\nTotalDays : 5.95991666666667E-06\nTotalHours : 0.000143038\nTotalMinutes : 0.00858228\nTotalSeconds : 0.5149368\nTotalMilliseconds : 514.9368\n \n \n# Now Test the Application with Async Logging\n# NOTE: In Windows Powershell, pass in the parameters in single quotes \n# Please remove the single quotes for Linux implementations\nPS C:\\Users\\Administrator>time java '-Xlog:async -XX:AsyncLogBufferSize=4M -Xlog:all=trace:file=hotspot-async.log:l,tg:filecount=0' HelloWorld\n Hello World!\n\n\nDays : 0\nHours : 0\nMinutes : 0\nSeconds : 0\nMilliseconds : 401\nTicks : 4015190\nTotalDays : 4.64721064814815E-06\nTotalHours : 0.000111533055555556\nTotalMinutes : 0.00669198333333333\nTotalSeconds : 0.401519\nTotalMilliseconds : 401.519\n\n```\n\nYou’ll want to run the above experiment a number of times to get a good average.\n\n### **Conclusion**\n\nThis blog introduced a new Unified Logging features in OpenJDK and [Corretto 17](https://aws.amazon.com/corretto). Asynchronous logging reduces application pauses due to UL by decoupling logging from disk I/O. Dynamic configuration, which has been available since OpenJDK 9, provides true on-demand logging to avoid the constant overhead of UL until it is really needed. They are orthogonal, so Java users can use them independently.\n\n#### **About the authors**\n\n![image.png](https://dev-media.amazoncloud.cn/c0f601f72bae457f964a8336ff8a849c_image.png)\n\n**Xin Liu**\nXin Liu is a Senior Software Engineer focused on the Corretto Java Development Kit. He has a passion for improving Corretto and OpenJDK. He is located in the Seattle area. You can find him on Twitter at @navyasm.\n\n![image.png](https://dev-media.amazoncloud.cn/0c7a087b64774245aaa704703f835410_image.png)\n\n**Mike Cook**\nMike Cook is a Principal Product Manager focused on Corretto. He would like to improve the Java Developer and Operations experiences. He is located in New Jersey. You can find him on Twitter at @correttoMike.","render":"<p>While working with an AWS service team to diagnose unexpected Garbage Collection (GC) pauses, the <a href=\"https://aws.amazon.com/corretto\" target=\"_blank\">Amazon Corretto</a> team discovered that applications were being throttled by I/O blocking while writing logs to disk. As a result, we decided to implement and contribute async-logging(<a href=\"https://bugs.openjdk.java.net/browse/JDK-8229517\" target=\"_blank\">JDK-8229517</a>)) to OpenJDK 17. In this post, we’re going to explain how you can use the -Xlog:async switch and the Unified Logging (UL) framework to avoid extended GC pauses due to I/O. Subsequent to our changes, the service improved their outlier p99.99 GC pause latency from 3s to 1s.</p>\n<p>UL is a logging framework for the Java Virtual Machine (JVM) introduced in OpenJDK 9. Many AWS services detect anomalies and subsequently take action based upon logging output from running JVMs. Logs also provide abundant details about Garbage Collection (GC) activities, helping developers to pinpoint the cause of long pauses and tune GC performance based on them.</p>\n<p>UL is flexible in that you can change its configuration at runtime. OpenJDK users can reduce their logging overhead and save disk space with a default, terse configuration, but can dynamically increase logging output when required. For example, a developer can map an alarm trigger to code that will ask the JVM to increase logging context and verbosity when the monitored JVM crosses a given threshold. Some AWS services use this pattern to reduce default operational loads.</p>\n<p>You can access these features using any OpenJDK distro, including Amazon’s <a href=\"https://github.com/corretto/corretto-17\" target=\"_blank\">Corretto 17</a>. To understand how this works, let’s dive into details.</p>\n<h3><a id=\"JVM_Unified_Logging_9\"></a><strong>JVM Unified Logging</strong></h3>\n<p>Unified Logging (UL) is a dynamic, tag-based logging framework for the HotSpot JVM. UL configuration is a mini language unto itself. OpenJDK users will find a formal syntax in <a href=\"https://openjdk.java.net/jeps/158\" target=\"_blank\">JEP-158</a> and a help message using<code>-Xlog:help</code>, but we’ll provide the basic concepts here.</p>\n<p>As of OpenJDK 17, developers can access hundreds of log <strong>tags</strong> that they can use to identify logging data output. Some examples of the available tag names are class, compilation, gc, metadata, and stats. These represent the object of the logging information. You can find a full list of these by running the help command noted above.</p>\n<p>Next, you can group these tags, so that you can adjust the logging levels associated with multiple logging tags simultaneously. We call this grouping a <strong>tagset</strong>. A developer will likely see the value in being able to turn up the logging level for a number of related tags.</p>\n<p>Then, we have the logging instrumentation. Let’s call these “<strong>log sites</strong>”. Each log site is logically associated with one tagset. Finally, we have a <strong>log selector</strong>, which is a query string you can use to match from zero to multiple tagsets along with their verbosity levels. It’s the log selector that you can use to filter the logging output.</p>\n<p>Here an example, the log message are obtained from <code>java -Xlog:'gc*=info:stdout'</code></p>\n<pre><code class=\"lang-[0.030s][info][gc,init]\">[0.050s][info][gc ] Using G1\n(redacted)\n[0.068s][info][gc,init] Concurrent Workers: 7\n...\n[0.068s][info][gc,metaspace] Narrow klass base: 0x0000000800000000, Narrow klass shift: 0, Narrow klass range: 0x100000000\n...\n\n</code></pre>\n<p>The first log message comes from the log site below; its tagset is (gc,init). The code will only be executed when the log is selected.</p>\n<pre><code class=\"lang-\">log_info_p(gc, init)("CardTable entry size: " UINT32_FORMAT, _card_size);\n\n</code></pre>\n<p>The second log message has a logset ‘gc’.</p>\n<pre><code class=\"lang-\">log_info(gc)("Using %s", _collectedHeap→name());\n\n</code></pre>\n<p>‘gc=info’ only selects logsites with the tagset ‘gc’. By contrast, the wildcard is used to match all tagsets. <code>'gc*'</code>will select tagset ‘gc’, ‘gc+init’ and all others with prefix gc. The log selector<code>gc*=info'</code>in the example breaks down into 2 parts: query things ‘gc*’ and verbosity level ‘info’.</p>\n<p>This blog post is not a thorough tutorial of UL. Readers who are not familiar with it can read <a href=\"https://openjdk.java.net/jeps/158\" target=\"_blank\">JEP-158 </a>or this <a href=\"https://blog.arkey.fr/2020/07/28/embracing-jvm-unified-logging-jep-158-jep-271/\" target=\"_blank\">tutorial</a>. Since OpenJDK 9, there’s been no separate GC log: it was integrated into UL. If you’ve used OpenJDK 8, you may have added <code>XX:+PrintGC</code>to your command line. Since OpenJDK 9, you’ve been able to select GC logs using the “gc” tag in combination with others. More details can be found in <a href=\"https://blog.arkey.fr/2020/07/28/embracing-jvm-unified-logging-jep-158-jep-271/\" target=\"_blank\">JEP-271</a>.</p>\n<pre><code class=\"lang-\">-Xlog:\n gc*=debug, \n :file=/mygclogs/%t-gc.log:uptime,tags,level:filecount=10,filesize=10M\n</code></pre>\n<p><strong>Asynchronous logging</strong></p>\n<p>You can direct output UL to a file. Even though file I/O is buffered, log writing still cannot be guaranteed to be non-blocking because the ‘write()’ syscall depends on the implementation of a specific filesystem and the underlying physical device. E.g., Linux software RAID needs to synchronize multiple writes to different disk partitions, and a network-backed Virtual File System (VFS) may block while writing due to a slow network connection.</p>\n<p>The HotSpot JVM has a global synchronization mechanism called a ‘<strong>safepoint</strong>’. At a safepoint, HotSpot forces all Java threads to pause. Some GC-related VM operations require a safepoint to ensure that Java threads do not mutate the Java heap while GC is running. If the UL writes to files during a safepoint and the writes happen to be delayed or blocked, then the safepoint pause time for such VM operations will be prolonged. These low-level incidents will increase application response time.</p>\n<p>One pattern that should draw your attention to potential I/O problems is an unusual CPU time distribution. The <code>gc,gc+cpu</code>tagset reveals such information. In the example below, we ran the Java application with 7 concurrent GC threads. We’d expect that the “User” time associated with the thread activity would be higher than the “Real” time. However, in the log we see that the “User” time is actually smaller, which makes us suspicious. One possible explanation for this could be that some GC threads were blocked by disk I/O. Perhaps they were not in the wait queue so they would not consume CPU time.</p>\n<pre><code class=\"lang-\">[gc ] GC(28) Pause Young (Mixed) (G1 Evacuation Pause) 8200M->7105M(10240M) 770.84ms\n[gc,cpu ] GC(28) User=0.36s Sys=0.02s Real=0.77s\n</code></pre>\n<p>To address the issue, the AWS Corretto Team developed a new feature called “Asynchronous Logging” (asynclog), and have added it to OpenJDK 17. When asynclog is enabled, logsites enqueue logging messages to an in-memory buffer and a dedicated, concurrent thread asynchronously flushes them to the specified output file. Log writes to the in-memory buffer are guaranteed to be non-blocking. By default, the intermediate buffer size is bounded to 2 MB, but configurable with <code>-XX:AsyncLogBufferSize</code>. If the buffer fills up before the asynchronous log writer thread can dump its contents to file, new log messages will be discarded. You can use the new option <code>-Xlog:async</code> to tell UL to run in asynclog mode.</p>\n<p>The following two experiments show why asynclog matters for Java applications. The first shows how it alleviates the logging penalty by leveraging concurrency. The second shows how asynclog improves high-percentile latency.</p>\n<p><strong>Performance impact on Java HelloWorld with full UL</strong></p>\n<p>In the first experiment, we enable <code>all=trace</code> to produce exhaustive UL logs. This tagset covers all parts and the entire JVM lifecycle. For a typical “HelloWorld” program running on OpenJDK 17, when we enabled this option, we observed 16,219 messages for synchronous UL and 16,225 log messages for <code>-Xlog:async</code>. The extra log messages are from the asynclog subsystem itself. Please note that <code>filecount=0</code> is a UL output option which prevents log file rotation. Here is the full Java command.</p>\n<pre><code class=\"lang-\">java -Xlog:async -Xlog:all=trace:file=all.log::filecount=0 HelloWorld\n\n</code></pre>\n<p>As you can see in the graph, in our experiment, asynchronous logging reduced the Real CPU time by 34.5% from 108 to 70.7ms. At the same time, we saw overall CPU utilization increase from 102.7% to 171.6%. The runtime gain is achieved by offloading the I/O task of log writing into a separate, concurrent thread. While this considerably reduces the absolute running time of the application it also increases overall CPU consumption due to overhead introduced by the additional thread.</p>\n<p><img src=\"https://dev-media.amazoncloud.cn/660866bd4aa14c11864bfc06d058c0a9_image.png\" alt=\"image.png\" /></p>\n<p><strong>Impact on high-percentile GC pauses</strong></p>\n<p>In a second experiment we ran the benchmark <a href=\"https://github.com/corretto/heapothesys/tree/master/HyperAlloc\" target=\"_blank\">heapothesys/HyperAlloc</a> on a c5d.2xlarge instance which was backed by Hard Disk Drives (HDD), which are far slower than SSD. To make the latency more obvious, we used a background script to ensure disk utilization was close to 100%. We selected all GC tagsets with ‘debug’ verbosity. Our test program created four threads with an allocation rate of 1024M/s for 180 seconds and created a <code>gc.log</code> output file of about 4MB.</p>\n<pre><code class=\"lang-\">java -Xlog:async -Xlog:'gc*=debug:file=gc.log::filecount=0' -javaagent:jHiccup.jar='-a -d 10000 -i 1000 -l asynclog.log' -jar HyperAlloc-1.0.jar -d 180\n</code></pre>\n<p>The data show how G1 GC latency is affected by disk I/O at p99.9 when GC logs are written synchronously. Some latency outliers (red spikes), aka <a href=\"Hiccups\" target=\"_blank\">Hiccups</a>, make the high-percentile GC pause time skyrocket. Asynclog is effective at curbing the impact of disk I/O, so even at p99.999 the latency remains below 15ms.</p>\n<p><img src=\"https://dev-media.amazoncloud.cn/f4b077d15ef24f2e95a410f78a7a9991_image.png\" alt=\"image.png\" /></p>\n<p><img src=\"https://dev-media.amazoncloud.cn/62e8352bdfdc47649648031e80746318_image.png\" alt=\"image.png\" /></p>\n<h4><a id=\"Dynamic_Configuration_99\"></a><strong>Dynamic Configuration</strong></h4>\n<p>OpenJDK users can change UL configuration on the fly even if they do not set up UL arguments on the Java command line. <code>jcmd</code> is a standalone tool which sends diagnostic commands (dcmds) to HotSpot. HotSpot has a thread called AttachListener, which listens on a local socket and processes incoming dcmds. For UL, the <code>VM.log</code> dcmd can display the current UL settings and change them. The following table shows the arguments of VM.log.</p>\n<p><img src=\"https://dev-media.amazoncloud.cn/3f1b44ae779b4d81b9b997b59c13fb1c_image.png\" alt=\"image.png\" /></p>\n<p>Assuming you have a Java application started whose PID is 85254. You can use VM.log to add a new log output.</p>\n<pre><code class=\"lang-\">$jcmd 85254 VM.log output=gc.log what=gc=debug decorators=uptime,pid,tags,level\n85254:\nCommand executed successfully\n\n</code></pre>\n<p><code>gc=debug</code> instructs UL to select log entries with the tag <code>gc</code> and a verbosity that is equal or higher than <code>debug</code>. The output goes to the file <code>gc.log</code>. When this dcmd completes, you will start observing the GC log in the file. You can use the <code>list</code> sub-command to verify that the log output has been updated.</p>\n<pre><code class=\"lang-\">$jcmd 85254 VM.log list\nLog output configuration:\n#0: stdout all=warning,gc=debug uptime,pid,level,tags (reconfigured)\n#1: stderr all=off uptime,level,tags\n#2: file=gc.log all=off,gc=debug uptime,pid,level,tags filecount=0,filesize=20480K,async=false (reconfigured)\n</code></pre>\n<p>Java users can increase the log verbosity or expand the tagset if more information is needed. E.g.,</p>\n<pre><code class=\"lang-\">$jcmd 85254 VM.log output=#2 what='gc*=trace'\n</code></pre>\n<p><code>output=#2</code> refers to the prior output file <code>gc.log</code>. <code>gc*</code> is a log selector with a wildcard. It matches all tagsets with <code>gc</code>, which is broader than before. Meanwhile, the verbosity level has increased from <code>debug</code> to <code>trace</code>. Here is a sample of logs after this adjustment.</p>\n<pre><code class=\"lang-\">\n[34.844s][debug][gc,heap ] GC(90) Heap after GC invocations=91 (full 0):\n[34.844s][debug][gc,heap ] GC(90) garbage-first heap total 10485760K, used 6873044K [0x0000000580000000, 0x0000000800000000)\n[34.844s][debug][gc,heap ] GC(90) region size 8192K, 14 young (114688K), 14 survivors (114688K)\n[34.844s][debug][gc,heap ] GC(90) Metaspace used 4685K, committed 4864K, reserved 1056768K\n[34.844s][debug][gc,heap ] GC(90) class space used 505K, committed 640K, reserved 1048576K\n[34.844s][info ][gc ] GC(90) Pause Young (Normal) (G1 Evacuation Pause) 7255M->6711M(10240M) 185.667ms\n[34.844s][info ][gc,cpu ] GC(90) User=0.19s Sys=0.25s Real=0.18s\n[34.844s][trace][gc,region ] G1HR ALLOC(EDEN) [0x00000007fe800000, 0x00000007fe800000, 0x00000007ff000000]\n[34.844s][trace][gc,task ] G1 Service Thread (Remembered Set Sampling Task) (schedule) @35.130s\n[34.844s][trace][gc,alloc ] Thread-0: Successfully scheduled collection returning 0x00000007fe800000\n[34.844s][debug][gc,task ] G1 Service Thread (Remembered Set Sampling Task) (run) 82.982ms (cpu: 82.979ms)\n[34.844s][trace][gc,tlab ] ThreadLocalAllocBuffer::compute_size(3) returns 524288\n[34.844s][debug][gc,task,start ] G1 Service Thread (Periodic GC Task) (run)\n[34.844s][trace][gc,tlab ] TLAB: fill thread: 0x00007fd24c016c00 [id: 38147] desired_size: 4096KB slow allocs: 0 refill waste: 65536B alloc: 0.99843 8179KB refills: 1 waste 0.0% gc: 0B slow: 0B\n</code></pre>\n<p>ava developers can also control UL programmatically right from their application because the dcmd functionality has been exposed as a MXBean. First of all, Java applications need to enable JMX.</p>\n<pre><code class=\"lang-\">-Dcom.sun.management.jmxremote.port=9999 \\\n-Dcom.sun.management.jmxremote.authenticate=false \\\n-Dcom.sun.management.jmxremote.ssl=false\n\n</code></pre>\n<p>The following procedure reconfigures output #2 via JMX. The effect is same as the prior jcmd command. Please note that authentication and SSL are ignored for simplicity. That is not a good practice, but this is a <a href=\"https://github.com/navyxliu/JavaGCworkload/blob/master/CrankUpGCLog.java\" target=\"_blank\">demo</a>.</p>\n<pre><code class=\"lang-\">import javax.management.MBeanServerConnection;\nimport javax.management.ObjectName;\nimport javax.management.remote.*;\n\npublic class CrankUpGCLog {\n public static void main(String[] args) throws Exception {\n JMXServiceURL url =\n new JMXServiceURL("service:jmx:rmi:///jndi/rmi://:9999/jmxrmi");\n JMXConnector jmxc = JMXConnectorFactory.connect(url, null);\n MBeanServerConnection mbsc = jmxc.getMBeanServerConnection();\n ObjectName mxbean = new ObjectName("com.sun.management:type=DiagnosticCommand");\n</code></pre>\n<h4><a id=\"Testing_it_yourself_168\"></a><strong>Testing it yourself</strong></h4>\n<p>If you’d like to experiment yourself with Async logging, you can follow the instructions below. Please ensure you have <a href=\"https://aws.amazon.com/corretto/?filtered-posts.sort-by=item.additionalFields.createdDate&filtered-posts.sort-order=desc\" target=\"_blank\">Corretto 17</a> or later installed, as the Asynchronous logging feature was added in Corretto 17.</p>\n<pre><code class=\"lang-\">\njava -version\n\nopenjdk version "17.0.1" 2021-10-19 LTS\nOpenJDK Runtime Environment Corretto-17.0.1.12.1 (build 17.0.1+12-LTS)\nOpenJDK 64-Bit Server VM Corretto-17.0.1.12.1 (build 17.0.1+12-LTS, mixed mode, sharing)\n</code></pre>\n<p><strong>Now write HelloWorld Java Application in HelloWorld.java</strong></p>\n<pre><code class=\"lang-\">$ javac HelloWorld.java \n</code></pre>\n<p><strong>Windows (In Powershell, create a time function to measure the elapsed time it takes to execute the application)</strong></p>\n<pre><code class=\"lang-\">#Create the following function in Windows Powershell\n#Don't do this for Linux. The 'time' function is already built in\nPS c:\\Users\\your_name>function time { $Command = "$args"; Measure-Command { Invoke-Expression $Command 2>&1 | out-default} }\n</code></pre>\n<p><strong>On MacOS, Linux, and Windows (Using Powershell)</strong></p>\n<p>Please note that the output below was obtained via Windows Powershell.</p>\n<pre><code class=\"lang-\"># Test the Application without using Async Logging\n# You can run it a few times to get an average\n# NOTE: in Windows Powershell, pass in the parameters in single quotes \n# Please remove the single quotes for Linux implementations\n\n PS C:\\Users\\Administrator> time java '-Xlog:all=trace:file=hotspot.log.1:l,tg:filecount=0' HelloWorld\n Hello World!\n\n\nDays : 0\nHours : 0\nMinutes : 0\nSeconds : 0\nMilliseconds : 514\nTicks : 5149368\nTotalDays : 5.95991666666667E-06\nTotalHours : 0.000143038\nTotalMinutes : 0.00858228\nTotalSeconds : 0.5149368\nTotalMilliseconds : 514.9368\n \n \n# Now Test the Application with Async Logging\n# NOTE: In Windows Powershell, pass in the parameters in single quotes \n# Please remove the single quotes for Linux implementations\nPS C:\\Users\\Administrator>time java '-Xlog:async -XX:AsyncLogBufferSize=4M -Xlog:all=trace:file=hotspot-async.log:l,tg:filecount=0' HelloWorld\n Hello World!\n\n\nDays : 0\nHours : 0\nMinutes : 0\nSeconds : 0\nMilliseconds : 401\nTicks : 4015190\nTotalDays : 4.64721064814815E-06\nTotalHours : 0.000111533055555556\nTotalMinutes : 0.00669198333333333\nTotalSeconds : 0.401519\nTotalMilliseconds : 401.519\n\n</code></pre>\n<p>You’ll want to run the above experiment a number of times to get a good average.</p>\n<h3><a id=\"Conclusion_242\"></a><strong>Conclusion</strong></h3>\n<p>This blog introduced a new Unified Logging features in OpenJDK and <a href=\"https://aws.amazon.com/corretto\" target=\"_blank\">Corretto 17</a>. Asynchronous logging reduces application pauses due to UL by decoupling logging from disk I/O. Dynamic configuration, which has been available since OpenJDK 9, provides true on-demand logging to avoid the constant overhead of UL until it is really needed. They are orthogonal, so Java users can use them independently.</p>\n<h4><a id=\"About_the_authors_246\"></a><strong>About the authors</strong></h4>\n<p><img src=\"https://dev-media.amazoncloud.cn/c0f601f72bae457f964a8336ff8a849c_image.png\" alt=\"image.png\" /></p>\n<p><strong>Xin Liu</strong><br />\nXin Liu is a Senior Software Engineer focused on the Corretto Java Development Kit. He has a passion for improving Corretto and OpenJDK. He is located in the Seattle area. You can find him on Twitter at @navyasm.</p>\n<p><img src=\"https://dev-media.amazoncloud.cn/0c7a087b64774245aaa704703f835410_image.png\" alt=\"image.png\" /></p>\n<p><strong>Mike Cook</strong><br />\nMike Cook is a Principal Product Manager focused on Corretto. He would like to improve the Java Developer and Operations experiences. He is located in New Jersey. You can find him on Twitter at @correttoMike.</p>\n"}