Skip to content

Performance

Hermes is designed for high-performance logging with multiple optimization techniques to minimize overhead.

Performance Goals

  1. Minimal overhead for disabled log statements: <5ns
  2. Fast message formatting: Zero-allocation using ThreadLocal StringBuilder
  3. High throughput: 10M+ messages/sec with async logging
  4. Low latency: Sub-microsecond for async publish
  5. Predictable performance: No GC pressure from logging

Benchmark Results

Latency (single-threaded)

Operation Latency Notes
Level check 1-2ns if (log.isDebugEnabled())
Disabled log statement 2-5ns Early exit optimization
Enabled log (sync) 50-100ns Format + create LogEvent
Enabled log (async) 500-1000ns Publish to ring buffer
File I/O (sync) 10-100µs Depends on disk speed

Throughput (multi-threaded)

Configuration Throughput CPU Usage
Synchronous console ~1M msgs/sec 50-70%
Synchronous file ~500K msgs/sec 30-50%
Async console ~10M msgs/sec 80-100%
Async file ~8M msgs/sec 70-90%

Memory Usage

Component Memory per instance
Logger ~100 bytes
LogEvent ~200 bytes
ThreadLocal StringBuilder ~2KB per thread
Async ring buffer (8192) ~1.6MB
Async ring buffer (65536) ~13MB

Optimization Techniques

1. Early Exit

Problem: Formatting messages for disabled log levels wastes CPU

Solution: Check log level before formatting

public void info(String message, Object... args) {
    if (!isInfoEnabled()) {
        return;  // Early exit - zero allocation
    }
    // Format and log
}

Impact: Disabled log statements cost only ~2-5ns

2. ThreadLocal StringBuilder

Problem: Creating new StringBuilder for each log statement causes GC pressure

Solution: Reuse ThreadLocal StringBuilder

private static final ThreadLocal<StringBuilder> BUFFER =
    ThreadLocal.withInitial(() -> new StringBuilder(256));

public String format(String pattern, Object... args) {
    StringBuilder sb = BUFFER.get();
    sb.setLength(0);  // Reset for reuse
    // Format into sb
    return sb.toString();  // Only allocation
}

Impact: Reduces allocations from N+1 to 1 per log statement

3. Immutable LogEvent

Problem: Mutable events require synchronization for async logging

Solution: Use immutable Java 17 record

public record LogEvent(
    Instant timestamp,
    LogLevel level,
    String loggerName,
    String message,
    Throwable throwable,
    Map<String, String> mdc,
    Marker marker
) {
    // Immutable - safe to pass between threads
}

Impact: Thread-safe without synchronization, compact memory layout

4. LMAX Disruptor

Problem: Lock-based queues cause contention under high load

Solution: Lock-free ring buffer with CAS operations

graph LR
    A[Producer Thread] -->|Publish| B[Ring Buffer]
    B -->|Consume| C[Consumer Thread]
    C -->|Write| D[Appenders]

    style B fill:#ab47bc,stroke:#6a1b9a,stroke-width:4px,color:#fff

Impact: 10x faster than ArrayBlockingQueue

5. Batch Processing

Problem: Flushing after every log event causes I/O overhead

Solution: Buffer and flush in batches

private final List<LogEvent> buffer = new ArrayList<>(100);

public void append(LogEvent event) {
    buffer.add(event);
    if (buffer.size() >= 100) {
        flush();
    }
}

Impact: Reduces I/O calls by 100x

Performance Comparison

vs. Logback

Metric Hermes Logback Improvement
Disabled log statement 2-5ns 5-10ns 2x faster
Sync throughput 1M/sec 500K/sec 2x faster
Async throughput 10M/sec 2M/sec 5x faster
Memory per log 200 bytes 400 bytes 2x less

vs. Log4j2

Metric Hermes Log4j2 Improvement
Disabled log statement 2-5ns 3-8ns 1.5x faster
Sync throughput 1M/sec 800K/sec 1.25x faster
Async throughput 10M/sec 7M/sec 1.4x faster
Startup time <100ms 500ms+ 5x faster

Benchmarking Hermes

JMH Benchmark

@BenchmarkMode(Mode.Throughput)
@OutputTimeUnit(TimeUnit.SECONDS)
@State(Scope.Benchmark)
public class HermesBenchmark {

    private Logger log;

    @Setup
    public void setup() {
        log = LoggerFactory.getLogger(HermesBenchmark.class);
    }

    @Benchmark
    public void disabledLog() {
        log.debug("This is disabled");
    }

    @Benchmark
    public void enabledLog() {
        log.info("Message with {} args", 123);
    }
}

Running Benchmarks

mvn clean install
java -jar target/benchmarks.jar

Performance Tuning

1. Async Logging

Enable async for high-throughput scenarios:

hermes:
  async:
    enabled: true
    queue-size: 8192  # Tune based on load

Guidelines:

  • Low volume (<1K/sec): queue-size=1024
  • Medium volume (1K-10K/sec): queue-size=8192
  • High volume (>10K/sec): queue-size=65536

2. Conditional Logging

Check log level before expensive operations:

if (log.isDebugEnabled()) {
    log.debug("Result: {}", expensiveComputation());
}

3. Avoid String Concatenation

// ❌ Bad - concatenation happens regardless of level
log.debug("User: " + user.getName() + ", Age: " + user.getAge());

// ✅ Good - only formatted if DEBUG enabled
log.debug("User: {}, Age: {}", user.getName(), user.getAge());

4. Minimize MDC Usage

MDC lookups have overhead:

// ❌ Bad - MDC operations in hot path
for (int i = 0; i < 1000000; i++) {
    MDC.put("iteration", String.valueOf(i));
    log.info("Processing");
    MDC.remove("iteration");
}

// ✅ Good - MDC at higher level
MDC.put("batchId", batchId);
for (int i = 0; i < 1000000; i++) {
    log.info("Processing iteration {}", i);
}
MDC.remove("batchId");

5. Choose Appropriate Layouts

Layout Overhead Use Case
SimpleLayout Very low Development
PatternLayout Low Production console/file
JsonLayout Medium Log aggregation systems

Performance Monitoring

Metrics to Track

  1. Throughput: Messages logged per second
  2. Latency: Time to publish to ring buffer
  3. Dropped events: Events lost due to full queue
  4. Queue utilization: Current buffer usage
  5. GC pressure: Allocations from logging

Monitoring Code

@Component
public class LoggingMetrics {

    private final AsyncAppender asyncAppender;
    private final MeterRegistry registry;

    @Scheduled(fixedRate = 60000)
    public void recordMetrics() {
        registry.gauge("logging.queue.size",
            asyncAppender.getQueueSize());
        registry.gauge("logging.dropped.events",
            asyncAppender.getDroppedEventCount());
    }
}

Performance Troubleshooting

High CPU Usage

Symptoms: CPU at 100% during logging

Causes:

  • Too much synchronous logging
  • Slow appenders (network, file I/O)
  • String concatenation in log messages

Solutions:

  • Enable async logging
  • Check log level before expensive operations
  • Use parameterized logging

High Memory Usage

Symptoms: Heap growing, frequent GC

Causes:

  • Large async queue size
  • Many threads (ThreadLocal buffers)
  • Memory leak in appenders

Solutions:

  • Reduce queue size
  • Limit thread pool size
  • Check appender lifecycle (start/stop)

Dropped Events

Symptoms: getDroppedEventCount() > 0

Causes: - Queue too small for load - Slow consumers (file I/O, network) - Burst traffic

Solutions: - Increase queue size - Enable blocking mode (if acceptable) - Use faster appenders

Best Practices Summary

  1. ✅ Enable async logging for high throughput
  2. ✅ Check log level before expensive operations
  3. ✅ Use parameterized logging (not concatenation)
  4. ✅ Keep MDC values at appropriate scope
  5. ✅ Choose efficient layouts
  6. ✅ Monitor queue utilization
  7. ✅ Profile and benchmark regularly
  8. ❌ Don't log in tight loops
  9. ❌ Don't use string concatenation
  10. ❌ Don't ignore dropped events