-
-
Notifications
You must be signed in to change notification settings - Fork 114
Include detection caching #175
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
This function never fails, so no need to return and check for an error value. Signed-off-by: Matthijs Kooijman <matthijs@stdin.nl>
Originally, this code was written to find all include statements in a source file, for further processing. However, since a while the include finding was changed to run the preprocessor to see if any includes are missing, and this class is used to find the missing includes in the error message. In practice, the preprocessor will bail out at the first missing include, since there is no meaningful way for it to continue. This means there will be at most one include in the error output. If there would be multiple includes (either because the preprocessor decided to continue, or there is some other output that accidentially matches the regex), it would be harmful to actually process all of them, since the absence of the first include might influence the presence of the second (e.g. in the presence of #ifdefs). This commit enforces that only one include is returned at a time, slightly simplifying the code in the process. The filename and class should be renamed to singular as well, but this will be left for a future commit to reduce noise. Signed-off-by: Matthijs Kooijman <matthijs@stdin.nl>
Signed-off-by: Matthijs Kooijman <matthijs@stdin.nl>
Previously, during include detection, every pass through IncludesToIncludeFolders would iterate over all included filenames found so far, and rebuild the list of include folders from the list of included libraries again. This commit consistes of these changes: - IncludesToIncludeFolders now looks at ctx.IncludeJustFound to find new libraries (instead of ctx.Includes which contains all included filenames found so far). To minimize changes in this commit, it still creates a slice with just the single include found and processes that. A future commit will simplify this. - With that change, ctx.Includes is no longer used so it is removed. - The resolveIncludeFolders function is removed. It used to build the list of include folders, based on the list of imported libaries. This list was built from scratch again every time, including the core and variant folders. These two folders are now added once, by ContainerFindIncludes, and adding the library's source folders to the include folder list is done by IncludesToIncludeFolders. - ContainerFindIncludes would start by calling IncludesToIncludeFolders (even before any includes were detected) to ensure that the include path contained the core and variant folders on the first preprocessor run already. Now that these paths are added by ContainerFindIncludes, there is no need for this extra call to IncludesToIncludeFolders, so it is removed. - resolveLibraries is modified to actually return only newly imported libraries, not all imported libraries. - resolveLibrary now returns the library it resolved to, instead of just adding it to the markImportedLibrary map. Signed-off-by: Matthijs Kooijman <matthijs@stdin.nl>
This code was written to handle processing of multiple includes after each other, but it only needs to handle one. This allows simplifying the code a bit. Behaviour should not be changed. Signed-off-by: Matthijs Kooijman <matthijs@stdin.nl>
The removed code iterates over the detected libraries and adds their source directories to the queue for running further detection on them. However, IncludesToIncludeFolders (called by findIncludesUntilDone(), which is called a bit above the removed code) already does this, so there is no need to do it again. Since Ctx.FoldersWithSources is a UniqueStringQueue, this unneeded code was not affecting behaviour in any way. It seems likely that this code had a purpose previously, but it was lost in some refactor. However, a quick scan of the commit that introduced it (d6e378: Implemented library to library dependency full discovery) suggests that this code was useless from the start. Signed-off-by: Matthijs Kooijman <matthijs@stdin.nl>
Previously, it was called once for the main cpp file, and then in a loop for any extra source files from the sketch or dependent libraries. However, this can be simplified by just adding the main .cpp file to the queue, and then letting the loop handle it. This produces a tiny change in the order in which source files are processed for includes, but any setups affected by this are already prone to errors. Signed-off-by: Matthijs Kooijman <matthijs@stdin.nl>
This field was used during include detection, to collect the source folders whose source files should be processed for includes. This field was handled by the CollectAllSourceFilesFromFoldersWithSources pass, which finds all source files in these folders and adds them to the Context.CollectedSourceFiles queue instead. This commit changes all code that used to add directories to the FoldersWithSourceFiles queue to instead add all contained source files to the CollectedSourceFiles queue. For this, the CollectAllSourceFilesFromFoldersWithSources pass is transformed into regular function and renamed to QueueSourceFilesInFolder, so it can be directly called without needing any particular context entries. Note that the filename that contains the latter function is no longer entirely accurate, but this will be fixed in a later commit (left untouched in this commit to simplify review). Signed-off-by: Matthijs Kooijman <matthijs@stdin.nl>
This prepares for a change in the next commit. Signed-off-by: Matthijs Kooijman <matthijs@stdin.nl>
In previous commits, this method has been signficantly simplified, reducing it to little more than a wrapper around the resolveLibrary helper function. This commit renames resolveLibrary to ResolveLibrary to make it public and lets findIncludesUntilDone call it directly (instead of going through IncludesToIncludeFolders). findIncludesUntilDone now also takes care of processing the ResolveLibrary result, which was previously done by IncludesToIncludeFolders. The signature of ResolveLibrary is also changed to accept a Context, so it can now extract the needed variables itself, instead of needing IncludesToIncludeFolders to extract them. Note that the filename that contains ResolveLibrary is no longer entirely accurate, but this will be fixed in a later commit (to simplify review of this commit). Signed-off-by: Matthijs Kooijman <matthijs@stdin.nl>
This function used to be a pass, but it was reduced to a regular function in 1001916 (Remove Context.FoldersWithSourceFiles). By now, the function is only called from container_find_includes.go, so it makes sense to just move it into that file. This does not change any code, it just moves the function and renames it to be private. Signed-off-by: Matthijs Kooijman <matthijs@stdin.nl>
Since 631b09f (Remove IncludesToIncludeFolders.Run), the primary function in this file changed to become ResolveLibrary. This updates the filename to match. No code is changed, this just renames a file. Signed-off-by: Matthijs Kooijman <matthijs@stdin.nl>
This struct contains the path of a source file, together with the library or sketch it is contained in. This allows the SourceFile methods to generate the full paths to the source file, object file and dependency file, making it easier to pass around source files. This commit only adds the struct. Next commits will start to use this struct in the include detection, where it fills an immediate need, later it will be used in more places as well. Signed-off-by: Matthijs Kooijman <matthijs@stdin.nl>
This is a queue of SourceFile objects, similar to UniqueStringQueue. It is not used yet, only added. Signed-off-by: Matthijs Kooijman <matthijs@stdin.nl>
This helper does not do much, but it will be expanded in a later commit. Signed-off-by: Matthijs Kooijman <matthijs@stdin.nl>
Previously, the include detection kept a queue of strings representing full paths to source files that still needed to be processed and iterated over it. This commit turns this into a queue containing the (new) SourceFile struct. This gives the include detection code a bit more context about a source file, allowing to also generate the object and dependency file paths. Previously this was not feasible, since it did not know the origin (library/sketch) of a path, so it could not decide where within the build path the object file should live. This does not actually use this new context yet, but it will be useful when introducing caching next. Signed-off-by: Matthijs Kooijman <matthijs@stdin.nl>
Signed-off-by: Matthijs Kooijman <matthijs@stdin.nl>
This greatly reduces compilation time of big sketches (or sketches using big libraries) when only a small change has been mad. Instead of rerunning include detection for *all* source files, it is now only rerun for changed files (and usually more if the actual list of includes changed). Signed-off-by: Matthijs Kooijman <matthijs@stdin.nl>
✅ Build completed. ⬇️ Build URL: ℹ️ To test this build:
|
I'm trying this now with the Teensy Audio library. Seems to make no difference. The "g++ -E" step is being run on all the (many) library files every time I click Verify. |
It does make a huge speedup when I compile the Ethernet library AdvancedChatServer example for Arduino Mega. |
Hmmm... it works if I try IRremote's IRrecvDumpV2 example, compiling for Teensy 3.2. So it does seem to work with other libs, but there's something about the Teensy Audio library that's causing it to not use the cache data at all. :( I'm testing on Linux 64 bit. To reproduce this, start with a clean copy of 1.6.10 and then run this installer to add all the Teensy stuff into that 1.6.10 copy, then replace arduino-builder. Select Teensy 3.2 from Tools > Boards, and then open File > Examples > Audio > Synthesis > Guitar. Every Verify re-detects all the library dependencies, not using the cached results at all. |
When it does work, wow, really speeds things up! Hopefully this can work for the audio lib... it's a very large library with a lot of files, so perhaps a tough test case? |
@PaulStoffregen, thanks for testing and providing a broken testcase, I'll dig into it (but probably not before this weekend or next week). You link to an installer for Teensyduino, but is there also another installation method available? e.g. just a zip to unpack, or perhaps a boards manager url? I don't really like to run arbitrary binary installers without knowing what they'll do exactly (though it's not a blocking problem, just wondering). Is there anything in teensyduino that cannot be expressed in a board manager json file? I just ran the installer and tested the example you mention, and I can reproduce the problem. |
I found a bit of time in the train and couldn't resist looking into this already :-) Turns out this caching code exposes two existing bugs in the code, which were conveniently hiding each other:
The reason that the second bug doesn't normally cause any actual problems is that if an include is really missing, the error about it will show up later when actually compiling the problematic file. The reason the first bug doesn't normally cause problems is that when such an include from the The first bug could cause problems if a utility-path include is above an include for another library, and that other library is not included from other places (such as the sketch). This would cause inlude detection to miss that other library and the build will fail later on. I'll write a testcase to confirm this behaviour. As for fixing these, I think the fixes are fairly simple:
Fixing this will probably take a while longer, my train ride is just over :-p |
Oh, I didn't realize the utility folder would matter. I'm continuing to use this as I work on another project. This particular code has a lot of extra .cpp and .c and .h files in the sketch folder. It seems to be caching those very nicely. Saves a lot of time when working with more complex code using a lot of files! Very nice. Looks like there may be a similar issue with my heavily modified copy of the SD library. Can't reproduce it with the built-in SD lib. If you really want another test case, I could trim this down to something I can share. |
Does it have a utility folder? If so, it's likely the same issue. |
Both have utility folders, but one is src/utility, the other is just the utility without any src folder. |
Hm, src/utility should not be affected, since that isn't added to the include path during compilation. If you could provide a testcase, that would be useful (I'm also seeing if I can fix the other issue now, so perhaps you could await that before spending time on it). |
When an include could not be resolved, the preprocessor was ran again to show the include error as reported by the compiler to the user. However, the preprocessor was now always run against the main sketch, instead of the file currently being processed. Unless the problem was in fact in the main sketch, this caused no error to be shown and include detection continued with the next file instead of being aborted as intended. This fixes the problem by running the preprocessor on the right file. In the future, this error handling might be a bit more improved, but for now this should be sufficient. Signed-off-by: Matthijs Kooijman <matthijs@stdin.nl>
Yup, the one with src/utility seems to always work. Even with minor issues, excellent work on this caching. It's going to make a lot of Arduino users much happier. |
This path is constructed and checked when loading the library instead of in various places among the code, slightly simplifying the code. Signed-off-by: Matthijs Kooijman <matthijs@stdin.nl>
When compiling source files from a library, the utility folder of the library itself is in the include path. When doing include detection, this did not happen, which could lead to compilation errors. Because of the broken (but recently fixed) errorhandling in include detection, this error was not visible to the user. It would only have a visible effect if a library included another library *after* including a file from its utility folder through the include path, and that other library was not otherwise pulled in. In practice, this would almost never occur, but with the new caching the cache would be invalidated and this problem became visible. This is fixed by simply including the utility folder in the include path during include detection. This required explicitly passing the include path through the GCCPreprocRunnerForDiscoveringIncludes and GCCPreprocRunner. Signed-off-by: Matthijs Kooijman <matthijs@stdin.nl>
I just pushed a fix for the previously discovered issues. With this, caching works as expected for the Audio library example you indicated. Let me know if you find other issues, thanks again for testing! |
✅ Build completed. ⬇️ Build URL: ℹ️ To test this build:
|
Works great ! |
I've found one bug/limitation with this implementation: When compilation does not complete successfully, a lot of includes are also re-detected even though nothing changed for them. This is because the caching relies on .d files generated by compilation, so if the compilation finds an error and aborts, the caching works only for file up to and excluding the file with the error. The fix for this would be to let the include detection gcc command actually generate .d files as well, but that requires modifying the recipes, which might be somewhat tricky in terms of compatibility. I'm not sure if there is another easier fix for this... |
In this case the cache is lost and the library detection is performed again from the beginning, but this is not worse compared to the current code when this happens every time, right? If yes, I'm for merging anyway since it's a very good improvement, even with this limitation. |
Correct (it is even still a little bit better, since the cache will still be used up to the file with the error). |
This PR adds caching to the include detection mechanism, greatly improving the speed of builds with a lot of files.
This code was only just finished and has not seen a lot of testing yet. Unit tests should be added for it as well. However, I wanted to get the code out for review ASAP to collect some comments.
This PR is made on top of #174, so the first dozen or so commits should be ignored until that is merged.