Experiment: Reactive analysis with skip-lite CMT cache #8092
Open: cristianoc wants to merge 9 commits into master from reactive-analysis-experiment
On a real project, kindly provided:
Vendor skip-lite library and integrate reactive analysis capabilities:

- Vendor skip-lite marshal_cache and reactive_file_collection modules
- Modify C++ code to handle ReScript CMT file format (CMI+CMT headers)
- Add CmtCache module for mmap-based CMT file reading
- Add ReactiveAnalysis module for incremental file processing
- Add CLI flags: -cmt-cache, -reactive, -runs
- Add README.md with usage and benchmark instructions

Benchmark results (~5000 files):

- Standard: CMT processing 0.78s, Total 1.01s
- Reactive (warm): CMT processing 0.01s, Total 0.20s
- Speedup: 74x for CMT processing, 5x total

The reactive mode caches processed file_data and uses read_cmt_if_changed to skip unchanged files entirely on subsequent runs.
- Change --create-sourcedirs to default to true (always create .sourcedirs.json)
- Hide the flag from help since it's now always enabled
- Add deprecation warning when flag is explicitly used
- Fix package name mismatches in test projects:
  - deadcode rescript.json: sample-typescript-app -> @tests/reanalyze-deadcode
  - rescript-react package.json: @tests/rescript-react -> @rescript/react
Simplify the reactive_file_collection implementation by making 'v t be the concrete record type used at runtime, removing the unused phantom fields and internal wrapper type. This eliminates warning 69 about unused record fields and relies directly on a single record with its process function stored as an Obj.t-based callback.
- CmtCache: rewritten using Unix.stat for file change detection (mtime, size, inode) instead of C++ mmap cache
- ReactiveFileCollection: new pure OCaml module for reactive file collections with delta-based updates
- ReactiveAnalysis: refactored to use ReactiveFileCollection, collection passed as parameter (no global mutable state)
- Timing: only show parallel merge timing when applicable
- Deleted skip-lite vendor directory (C++ code no longer needed)

This eliminates the Linux/musl C++ compilation issue while maintaining the same incremental analysis performance:

- Cold run: ~1.0s
- Warm run: ~0.01s (90x faster, skips unchanged files)
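The stat-based change detection described above can be sketched roughly as follows. This is a minimal illustration, not the PR's CmtCache code: the names `file_id`, `cache`, and `read_if_changed` are hypothetical, and it needs the standard `unix` library to be linked.

```ocaml
(* Hypothetical sketch of stat-based change detection; not the PR's API. *)

(* Identity of a file's contents as far as stat can tell. *)
type file_id = {mtime: float; size: int; ino: int}

let file_id_of_path path =
  let st = Unix.stat path in
  {mtime = st.Unix.st_mtime; size = st.Unix.st_size; ino = st.Unix.st_ino}

(* Cache from path to the last seen identity and the value read then. *)
type 'a cache = (string, file_id * 'a) Hashtbl.t

let create () : 'a cache = Hashtbl.create 64

(* Re-read the file only when mtime, size, or inode differ from the
   cached identity; otherwise return the cached value untouched. *)
let read_if_changed (cache : 'a cache) ~(read : string -> 'a) path =
  let id = file_id_of_path path in
  match Hashtbl.find_opt cache path with
  | Some (old_id, v) when old_id = id -> `Unchanged v
  | _ ->
    let v = read path in
    Hashtbl.replace cache path (id, v);
    `Changed v
```

In a warm run, every unchanged file costs one `stat` call and one hash lookup, which is consistent with the warm-run numbers quoted above. Note that a rewrite that preserves mtime, size, and inode would go undetected by this scheme.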
- Create new analysis/reactive library with:
  - Reactive.ml: Core combinators (delta type, flatMap with merge)
  - ReactiveFileCollection.ml: Generic file collection with change detection
  - Comprehensive tests including multi-stage composition
- Remove CmtCache module (logic absorbed into ReactiveFileCollection)
- ReactiveAnalysis now uses generic ReactiveFileCollection with:
  - read_file: Cmt_format.read_cmt
  - process: CMT analysis function
- Test composition: files -> word_counts (with merge) -> frequent_words
  - Demonstrates delta propagation across multiple flatMap stages
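To make the delta type and flatMap-with-merge idea concrete, here is a small self-contained sketch of the files -> word_counts -> frequent_words composition. It is an assumption-laden illustration: the types and the `flat_map`, `emit`, and `get` names are hypothetical and are not taken from Reactive.ml.

```ocaml
(* Hypothetical sketch of delta propagation; not the code in Reactive.ml. *)
type ('k, 'v) delta = Set of 'k * 'v | Remove of 'k

type ('k, 'v) t = {
  tbl: ('k, 'v) Hashtbl.t;
  mutable subscribers: (('k, 'v) delta -> unit) list;
}

let create () = {tbl = Hashtbl.create 16; subscribers = []}
let get t k = Hashtbl.find_opt t.tbl k

(* Apply a delta to the collection, then forward it downstream. *)
let emit t d =
  (match d with
   | Set (k, v) -> Hashtbl.replace t.tbl k v
   | Remove k -> Hashtbl.remove t.tbl k);
  List.iter (fun f -> f d) t.subscribers

(* [f] maps one source entry to target entries; [merge] combines values
   that land on the same target key, within or across source entries. *)
let flat_map source ~f ~merge =
  let out = create () in
  let provenance = Hashtbl.create 16 in (* source key -> target keys *)
  let contribs = Hashtbl.create 16 in (* target key -> (source key -> value) *)
  let recompute k2 =
    let values =
      match Hashtbl.find_opt contribs k2 with
      | Some per_src -> Hashtbl.fold (fun _ v acc -> v :: acc) per_src []
      | None -> []
    in
    match values with
    | [] -> if Hashtbl.mem out.tbl k2 then emit out (Remove k2)
    | v :: rest -> emit out (Set (k2, List.fold_left merge v rest))
  in
  let retract k1 =
    let old = Option.value (Hashtbl.find_opt provenance k1) ~default:[] in
    List.iter
      (fun k2 ->
        match Hashtbl.find_opt contribs k2 with
        | Some per_src -> Hashtbl.remove per_src k1
        | None -> ())
      old;
    Hashtbl.remove provenance k1;
    old
  in
  let handle d =
    let k1, entries =
      match d with
      | Set (k, v) -> (k, f k v)
      | Remove k -> (k, [])
    in
    let old_keys = retract k1 in
    List.iter
      (fun (k2, v2) ->
        let per_src =
          match Hashtbl.find_opt contribs k2 with
          | Some h -> h
          | None ->
            let h = Hashtbl.create 4 in
            Hashtbl.replace contribs k2 h;
            h
        in
        let v =
          match Hashtbl.find_opt per_src k1 with
          | Some prev -> merge prev v2
          | None -> v2
        in
        Hashtbl.replace per_src k1 v)
      entries;
    let new_keys = List.map fst entries in
    if new_keys <> [] then Hashtbl.replace provenance k1 new_keys;
    (* Only target keys touched by this source entry are recomputed. *)
    List.iter recompute (List.sort_uniq compare (old_keys @ new_keys))
  in
  Hashtbl.iter (fun k v -> handle (Set (k, v))) source.tbl;
  source.subscribers <- handle :: source.subscribers;
  out

let words s = String.split_on_char ' ' s |> List.filter (fun w -> w <> "")

let files : (string, string) t = create ()

let word_counts =
  flat_map files
    ~f:(fun _path text -> List.map (fun w -> (w, 1)) (words text))
    ~merge:( + )

let frequent_words =
  flat_map word_counts
    ~f:(fun w n -> if n >= 2 then [(w, n)] else [])
    ~merge:( + )

let () =
  emit files (Set ("a.txt", "foo bar foo"));
  emit files (Set ("b.txt", "bar baz"))
(* Now word_counts holds foo=2, bar=2, baz=1 and
   frequent_words holds foo=2, bar=2. *)
```

Removing `a.txt` afterwards sends one `Remove` delta through both stages: `foo` disappears from both collections, `bar` drops to 1 and leaves `frequent_words`, and `baz` is never recomputed.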
- Add ReactiveMerge module for reactive merge of per-file DCE data
- Add extraction functions (builder_to_list, create_*) to data modules
- Expose types needed for reactive merge (CrossFileItems.t fields, etc.)
- ReactiveAnalysis: add iter_file_data, collect_exception_results
- runAnalysis: use ReactiveMerge for decls/annotations/cross_file when reactive mode enabled

Note: refs and file_deps still use O(n) iteration because they need post-processing (type-label deps, exception refs). Next step is to make these reactive via indexed lookups.
- Add Reactive.lookup: single-key subscription from a collection
- Add Reactive.join: reactive hash join between two collections
- Add ReactiveTypeDeps: type-label dependencies via reactive join
- Add ReactiveExceptionRefs: exception ref resolution via reactive join
- Update ARCHITECTURE.md with generated SVG diagrams
- Add diagram sources (.mmd) for batch pipeline, reactive pipeline, delta propagation

The reactive modules express cross-file dependency resolution declaratively:

- ReactiveTypeDeps uses flatMap to index decls by path, then join to connect impl<->intf
- ReactiveExceptionRefs uses join to resolve exception paths to declaration locations
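As a rough illustration of the reactive hash join, the sketch below resolves exception references (left side) against declaration locations (right side) and keeps the output current as either side changes. The record layout and the `update_left` / `update_right` names are assumptions made for this example; Reactive.join itself may be shaped quite differently.

```ocaml
(* Hypothetical sketch of an incremental hash join; not Reactive.join. *)
type ('k, 'v) delta = Set of 'k * 'v | Remove of 'k

type ('lk, 'lv, 'rk, 'rv) join = {
  left: ('lk, 'lv) Hashtbl.t;
  right: ('rk, 'rv) Hashtbl.t;
  index: ('rk, 'lk list) Hashtbl.t; (* right key -> left keys that use it *)
  key_of: 'lk -> 'lv -> 'rk; (* which right key a left entry looks up *)
  out: ('lk, 'lv * 'rv) Hashtbl.t; (* resolved pairs only *)
}

let create ~key_of =
  {
    left = Hashtbl.create 16;
    right = Hashtbl.create 16;
    index = Hashtbl.create 16;
    key_of;
    out = Hashtbl.create 16;
  }

let dependents j rk = Option.value (Hashtbl.find_opt j.index rk) ~default:[]

(* Recompute the output entry for a single left key. *)
let refresh j lk =
  match Hashtbl.find_opt j.left lk with
  | None -> Hashtbl.remove j.out lk
  | Some lv -> (
    match Hashtbl.find_opt j.right (j.key_of lk lv) with
    | Some rv -> Hashtbl.replace j.out lk (lv, rv)
    | None -> Hashtbl.remove j.out lk)

(* Drop a left key from the reverse index before it changes. *)
let unindex j lk =
  match Hashtbl.find_opt j.left lk with
  | None -> ()
  | Some lv ->
    let rk = j.key_of lk lv in
    Hashtbl.replace j.index rk (List.filter (fun k -> k <> lk) (dependents j rk))

let update_left j d =
  match d with
  | Set (lk, lv) ->
    unindex j lk;
    Hashtbl.replace j.left lk lv;
    let rk = j.key_of lk lv in
    Hashtbl.replace j.index rk (lk :: dependents j rk);
    refresh j lk
  | Remove lk ->
    unindex j lk;
    Hashtbl.remove j.left lk;
    refresh j lk

(* A right-side change only touches the left keys indexed under it. *)
let update_right j d =
  let rk =
    match d with
    | Set (rk, rv) ->
      Hashtbl.replace j.right rk rv;
      rk
    | Remove rk ->
      Hashtbl.remove j.right rk;
      rk
  in
  List.iter (refresh j) (dependents j rk)
```

The reverse index is what makes this incremental: when a declaration moves or disappears, only the references that actually name it are revisited, rather than the whole reference set.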
Bug fix:

- ReactiveFileCollection.process now receives (path, raw) instead of just (raw)
- ReactiveAnalysis passes cmtFilePath to DceFileProcessing.process_cmt_file
- This fixes @genType annotations being incorrectly collected when .cmti exists

Previously, reactive mode passed cmtFilePath:"" which made the .cmti existence check always return false, causing @genType annotations to be collected even for files with interface files (where they should be ignored).

Architecture updates:

- Updated reactive-pipeline.mmd with accurate node names
- Added legend table explaining all diagram symbols
- Diagram now shows: VR/TR (per-file refs), all ReactiveTypeDeps fields, ReactiveExceptionRefs flow, and combined output

Helper functions added for debugging (kept as useful):

- Declarations.length, FileAnnotations.length/iter
- References.value_refs_length/type_refs_length
- FileDeps.files_count/deps_count
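The effect of passing an empty path can be illustrated with a small sketch. The function name and the way the `.cmti` path is derived are assumptions for illustration only; the actual check inside DceFileProcessing may be written differently.

```ocaml
(* Hypothetical sketch of the .cmti existence check; not the PR's code.
   [exists] is injected so the example does not touch the file system;
   in real use it could be [Sys.file_exists]. *)
let should_collect_gentype ~(exists : string -> bool) ~cmt_path =
  (* "lib/Foo.cmt" -> "lib/Foo.cmti"; the empty path becomes ".cmti". *)
  let cmti = Filename.remove_extension cmt_path ^ ".cmti" in
  not (exists cmti)

(* Pretend only lib/Foo has an interface file. *)
let exists p = p = "lib/Foo.cmti"

(* With the real path the interface is found, so annotations are skipped. *)
let with_path = should_collect_gentype ~exists ~cmt_path:"lib/Foo.cmt"

(* With the empty path the lookup misses, so annotations are collected. *)
let with_empty_path = should_collect_gentype ~exists ~cmt_path:""
```

Under these assumptions `with_path` is `false` and `with_empty_path` is `true`, which matches the incorrect collection described in the bug fix above.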
Introduce five new store modules that wrap reactive collections directly, eliminating O(N) freeze/copy operations in the merge phase:

- AnnotationStore: wraps FileAnnotations reactive collection
- DeclarationStore: wraps Declarations reactive collection
- ReferenceStore: combines 7 reactive sources (value_refs, type_refs, type_deps.*, exception_refs) without copying
- FileDepsStore: wraps file deps reactive collections
- CrossFileItemsStore: iterates reactive collection directly without intermediate allocation

Performance improvement on 4900-file benchmark:

- Merge time: 37ms → 0.002ms (eliminated)
- Warm run total: 165ms → 129ms (22% faster)

The stores provide a unified interface that works with both frozen (non-reactive) and reactive data, dispatching to the appropriate implementation at runtime.

Signed-off-by: Cristiano Calcagno <[email protected]>
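One way to picture a store that dispatches between frozen and reactive data at runtime is a two-case variant. This is only a sketch under assumed names (`store`, `Frozen`, `Reactive`, `find`, `iter`, `length`); the real store modules wrap the project's own collection types.

```ocaml
(* Hypothetical sketch of a frozen/reactive store; not the PR's modules. *)
module StringMap = Map.Make (String)

type 'v store =
  | Frozen of 'v StringMap.t (* immutable snapshot, as in batch mode *)
  | Reactive of (string, 'v) Hashtbl.t (* live table, read in place *)

(* Each operation picks the implementation matching the constructor. *)
let find store key =
  match store with
  | Frozen m -> StringMap.find_opt key m
  | Reactive tbl -> Hashtbl.find_opt tbl key

let iter f store =
  match store with
  | Frozen m -> StringMap.iter f m
  | Reactive tbl -> Hashtbl.iter f tbl

let length store =
  match store with
  | Frozen m -> StringMap.cardinal m
  | Reactive tbl -> Hashtbl.length tbl
```

Because the `Reactive` case holds the live table rather than a copy, constructing the store is constant-time and later updates to the table are visible through it, which is the property that removes the O(N) freeze step from the merge phase.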