doc/src/devdocs/pkgimage-loading-performance.md
Lines changed: 42 additions & 2 deletions
@@ -427,9 +427,49 @@ Potential future approaches:
 | Refactor `jl_codegen_params_t` for explicit lock management | Enable shared context | Very High | Requires careful analysis of all shared state |
 | Pipeline parallelism (inference ‖ codegen ‖ LLVM opt) | Better utilization | High | Data dependencies, buffering |
 | Batch codegen at module granularity | Coarser parallelism | Medium | Load balancing |
-| Parallel LLVM optimization (already done) | ✅ Already parallelized | - |3 threads used for native code gen|
+| Parallel LLVM optimization (already done) | ✅ Already parallelized | - |Variable threads based on module size|
 
-**Current status:** Parallel codegen disabled. The native code generation phase already uses 3 threads (visible in debug output), so some parallelism exists in the later stages.
+**Current status:** Parallel codegen disabled. The native code generation phase uses parallel LLVM optimization (visible in debug output as "threads: N"), with thread count determined dynamically based on module complexity.
+
+### Thread Pool Coordination
+
+When multiple Julia processes run parallel precompilation (e.g., `Pkg.precompile()`), each process independently decides how many threads to use for native code generation. This can lead to thread oversubscription on the system.
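
To make the oversubscription concern concrete, the arithmetic can be sketched as below. The worker count, per-worker thread count, and machine size are hypothetical numbers chosen for illustration; they do not come from the patch.

```python
def oversubscription(workers: int, threads_per_worker: int, cpu_threads: int) -> float:
    """Return requested threads divided by hardware threads (> 1.0 means oversubscribed)."""
    return workers * threads_per_worker / cpu_threads


# Hypothetical numbers: 8 precompile workers, 4 codegen threads each, 16 hardware threads.
print(oversubscription(workers=8, threads_per_worker=4, cpu_threads=16))
```

With these assumed values, 32 threads are requested on 16 hardware threads, a ratio of 2.0, which is the situation a shared pool is meant to prevent.
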
+
+A thread pool mechanism allows coordination across processes:
+
+**Location:** The thread pool file is stored at `DEPOT_PATH[1]/compiled/threadpool` (derived automatically from the precompilation output path).
+
+**Environment Variables:**
+
+- `JULIA_IMAGE_THREAD_POOL`: Set to `0` to disable cross-process thread coordination. Enabled by default.
+- `JULIA_IMAGE_THREAD_POOL_SIZE`: Maximum threads in the pool (default: number of CPU threads). This limits total threads used across all concurrent precompilation workers.
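
The documented defaults for the two variables can be expressed as a small lookup. This is only a sketch of the described behavior, not the patch's parsing code: the function name `pool_settings` is hypothetical, and falling back to the default when the size is not a positive integer is an assumption.

```python
import os


def pool_settings(env=os.environ):
    """Return (enabled, pool_size) following the documented defaults."""
    # Coordination is on unless the variable is explicitly set to "0".
    enabled = env.get("JULIA_IMAGE_THREAD_POOL", "1") != "0"

    # Default pool size: number of CPU threads on this machine.
    default_size = os.cpu_count() or 1
    raw = env.get("JULIA_IMAGE_THREAD_POOL_SIZE")
    try:
        size = int(raw) if raw else default_size
    except ValueError:
        size = default_size  # assumption: ignore unparsable values
    if size < 1:
        size = default_size  # assumption: ignore non-positive values
    return enabled, size


print(pool_settings({"JULIA_IMAGE_THREAD_POOL_SIZE": "8"}))
```

Passing a plain dictionary instead of `os.environ` makes the defaults easy to check without touching the real environment.
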
+
+**Usage Example:**
+
+```bash
+# Limit thread pool to 8 threads across all workers
+export JULIA_IMAGE_THREAD_POOL_SIZE=8
+julia -e "using Pkg; Pkg.precompile()"
+
+# Disable thread pool coordination (each worker uses all available threads)
+export JULIA_IMAGE_THREAD_POOL=0
+julia -e "using Pkg; Pkg.precompile()"
+```
+
+**How it works:**
+
+1. Thread acquisition happens just before LLVM parallel optimization begins (not at precompilation start)
+2. Each worker acquires up to its desired threads from the pool, waiting if necessary
+3. Threads are released immediately after LLVM optimization completes
+4. This just-in-time approach minimizes lock contention since native code gen is only ~24% of total precompilation time
+5. The pool file is automatically located at `~/.julia/compiled/threadpool` (or equivalent depot path)
+
+**Debug output:** When `JL_DEBUG_SAVING` is enabled, thread pool operations are logged:
+
+```text
+[pkgsave] thread pool: acquired 3 threads (2 -> 5 in use, pool size 8)
+[pkgsave] thread pool: released 3 threads (5 -> 2 in use)