diff --git a/language/duchain/Mainpage.dox b/language/duchain/Mainpage.dox index 8ca1a75a1..72feeda80 100644 --- a/language/duchain/Mainpage.dox +++ b/language/duchain/Mainpage.dox @@ -1,360 +1,360 @@ /*! * @mainpage Definition-Use Chain and Type System * * Overview | \ref duchain-design "Design" | \ref Implementing "Implementing" | \ref Using "Using" * * The definition-use chain and type system provide a language-neutral * representation of source code structure, used to provide language-based * features to all implemented languages in a generic manner. * * An introduction to the duchain can be found in the \ref duchain-design document. * * Details about how to provide a duchain and type system for your favourite * language can be found here: \ref Implementing. * * @licenses * @lgpl * * For questions and discussons about editor either contact the author * or the kdevelop-devel@kde.org * mailing list. */ /** \page duchain-design Definition-Use Chain Design * @ref index "Overview" | Design | \ref Implementing "Implementing" | \ref Using "Using" \section overview Overview The duchain is a sequence of contexts in a code file, and the associated definitions which occur in those contexts. A simplified way of thinking about it is that for each set of brackets (curly {} or not ()), there is a separate context. Each context is represented by a \ref KDevelop::DUContext. Each context will have one parent context (except in the case of the top level context which has none), and any number of child contexts (including none). Additionally, each context can import any number of other contexts. The reason for this will become clear later. Thus, the \ref KDevelop::DUContext structure resembles a directed acyclic graph, for those familiar with the concept. \section parsing Parsing These \ref KDevelop::DUContext "DUContexts" are created on the first pass after parsing the code to an AST (abstract syntax tree). Also, in this stage the data types are parsed, and any declarations which are encountered are recorded against the context in which they are encountered in. Each declaration is represented by a Declaration. Parsing code is arranged into builder classes, which subclass the AST visitor pattern. They are designed to be able to subclass each other, thus achieving multiple goals with each pass (as described in the above paragraph). For most languages, the first pass is accomplished by the \ref KDevelop::AbstractContextBuilder "AbstractContextBuilder", \ref KDevelop::AbstractTypeBuilder "AbstractTypeBuilder", and \ref KDevelop::AbstractDeclarationBuilder "AbstractDeclarationBuilder". The customised builder class is a subclass of each of these classes. Thus, in the first pass, the \ref KDevelop::AbstractContextBuilder "AbstractContextBuilder" creates the \ref KDevelop::DUContext "DUContext" tree, the \ref KDevelop::AbstractTypeBuilder "AbstractTypeBuilder" records which \ref KDevelop::AbstractType "types" are encountered, and the \ref KDevelop::AbstractDeclarationBuilder "AbstractDeclarationBuilder" creates \ref KDevelop::Declaration "Declaration" instances which are associated with the current type and context. The second pass is the creation of uses, accomplished a subclass of both the \ref KDevelop::AbstractContextBuilder and the \ref KDevelop::AbstractUseBuilder. On the second pass, we only iterate previously parsed contexts (as they are already created). Then, as variable uses are encountered, a \ref KDevelop::Use is created for each. A \ref KDevelop::Declaration is searched for in the current context, and if one is found, they are associated with each other. \section classes Classes and their purposes \li \ref KDevelop::DUChain - a global object which keeps track of all loaded source files and the top level context of their definition-use chains. -\li \ref KDevelop::DUContext - an object which represents a single context in a source file, and stores information about parent and child \ref KDevelop::DUContext "DUContexts", and \ref KDevelop::Declarations "Declarations", \ref KDevelop::Definitions "Definitions" and \ref KDevelop::Use "Uses" which occur in them. Also provides convenience methods for searching the chain. +\li \ref KDevelop::DUContext - an object which represents a single context in a source file, and stores information about parent and child \ref KDevelop::DUContext "DUContexts", and \ref KDevelop::Declaration "Declarations", \ref KDevelop::Definition "Definitions" and \ref KDevelop::Use "Uses" which occur in them. Also provides convenience methods for searching the chain. \li \ref KDevelop::Declaration - an object which represents a single declaration. Has several subclasses which store more information specific to the type of declaration which is being represented. \li \ref KDevelop::Use - an object which represents a use of a particular declaration. \li \ref KDevelop::PersistentSymbolTable - a hash which stores identifiers available in the top level context of a source file and their respective \ref KDevelop::Declaration "Declarations". \li KDevelop::*Builder - objects whose purpose is to iterate the parsed AST and produce instances of the duchain objects. \li \ref KDevelop::AbstractType - the base class for types. \section searching-duchain Definition-use chain searching Because iterating a complete definition-use chain can become expensive when they are large, when a search is being performed (eg. for a declaration corresponding to a certain identifier) it is first performed up to the top level context, then the symbol table is consulted. The symbol table is a hash of all identifiers which are known to the entire duchain. All potential matches are evaluated to see if they are visible from the location of the use. \section locking Locking The duchain is designed to operate in a multithreaded environment. This means that multiple parse jobs may be operating simultaneously, reading from and writing to the duchain. Thus, locking is required. A single read-write lock is used to serialise writes to the chain and allow concurrent reads. Thus, to call non-const methods, you must hold a write lock, and for const methods, a read lock. Customised read/write lockers have been created, called DUChainWriteLocker and DUChainReadLocker. You must not request a write lock while holding a read lock, or you could cause a deadlock. Also, when manipulating text editor ranges, the \ref KTextEditor::SmartInterface must be locked. \warning You must never attempt to acquire the duchain read or write lock when holding the smart lock, else you may cause a deadlock. See code in \ref KDevelop::AbstractContextBuilder::openContextInternal and \ref KDevelop::DUChainBase. \section plugin-interface Interface for plugins As plugins will be accessing the \ref KDevelop::DUChain from the main thread, they will need to hold a read lock. \section text-editor-integration Text editor integration The main classes are subclasses of a base class, \ref KDevelop::DUChainBase. This object holds a reference to the text range. When the source file is opened in an editor, the \ref KDevelop::EditorIntegrator will create smart text ranges, which are bound to the editor's copy of the document. From there, highlighting can be applied to these ranges, as well as other advanced functions (see the \ref KTextEditor documentation for possibilities). The language support will convert these ranges to smart ranges when the corresponding document is loaded into an editor. \section future Future features - ideas The completed duchain should allow for code refactoring, intelligent navigation, improved automatic code generation (eg. "create switch statement"), context-sensitive code completion, integration of documentation, debugger integration, a code structure view, call graph, static code analysis etc. */ /** * \page Implementing Implementing Definition-Use Chains for a specific language * * \ref index "Overview" | \ref duchain-design "Design" | Implementing | \ref Using "Using" * * \section create Creating the Definition-Use Chain * * To create a definition-use chain for a programming language, you need the following: * \li a parser for the language, * \li a context builder, * \li a type builder, * \li a declaration builder, * \li and a use builder. * * Once you have everything up to the declaration builder, your language's classes, functions etc. * will automatically appear in the class browser, and be able to perform limited refactoring. * * Once you have the use builder, you will automatically have full support for context browsing. * * Code completion support requires further work specific to your language, see \ref cc * * \subsection parser Parser * Parsers in %KDevelop can be created in any way as long as they produce an AST (abstract * syntax tree). Most supported languages have parsers generated by kdevelop-pg-qt. * This is a LL parser generator, and allows you to specify the grammar from which the * parser and AST are generated. The parser will also need a lexer, common solutions are to * use flex to create one for you, or to create one by hand. * * \subsection builders DUChain Builders * * The abstract builder classes (detailed below) provide convenience functions for creating a * definition-use chain. They are template classes which require 2 or 3 class * types: * - T: your base AST node type * - NameT: your identifier AST node type, if you have only one, or your base AST node type * if more than one exist * - Base class: your base class, eg. for your use builder, you will usually supply your custom * context builder here. * * \subsection context Context Builder * By subclassing \ref KDevelop::AbstractContextBuilder "AbstractContextBuilder", you will have everything you need to * keep track of contexts as you iterate the AST. When a new context is encountered, such * as a new block (eg. between {} brackets), create a new context with KDevelop::AbstractContextBuilder::openContext(), * and close it with KDevelop::AbstractContextBuilder::closeContext(). * * Some languages do not need a context to be created for each block, for example languages * where declarations are visible after the block in which they were defined (eg. php). * * \subsection type Type Builder * By subclassing \ref KDevelop::AbstractTypeBuilder "AbstractTypeBuilder", you can create types * when one is encountered in your AST by calling openType(). Again, you need to closeType() * when the type is exited. Complex types are built up this way by creating the type at each node, * ie. with int[], first an array type is opened, then an integral type representing an integer * is opened and closed, then when the array type is closed, you can retrieve the lastType() * and set that as the type which is being made into an array. * * \subsection declaration Declaration Builder * By subclassing \ref KDevelop::AbstractDeclarationBuilder "AbstractDeclarationBuilder", you can create * declarations when they are encountered in your AST. Usually you will assign the lastType() * or currentType() to them within closeDeclaration(). * * \subsection use Use Builder * By subclassing \ref KDevelop::AbstractUseBuilder "AbstractUseBuilder", you can create uses when they are encountered * in your AST, and they will be automatically registered with the current context. * * \subsection expr Expression Visitor * By subclassing \ref KDevelop::AbstractExpressionVisitor "AbstractExpressionVisitor", you can provide a convenient * way to request the type and/or instance information for a given expression. This is useful in both * figuring out what code completion to offer when it is invoked after or inside an expression, and to * allow complete use building, ie. for uses within expressions. In c++, this is a very complex matter, * involving type conversion and overload resolution, while still tracking template types. * * \section cc Implementing Code Completion * * To provide code completion for your language, you will need to implement the following: * \todo complete this section */ /** * \page Using Using already created Definition-Use Chains in plugins * * \ref index "Overview" | \ref duchain-design "Design" | \ref Implementing "Implementing" | Using * * \section intro Introduction * This section is designed for developers who want to use definition-use chains, for example to provide * code generation, refactoring, or other advanced language-specific functionality. First some important * fundamentals of using the duchain classes will be covered. * * \subsection pointers Definition-use chain pointers and references * As the definition-chain is a dynamic entity, safe pointers (DU*Pointer) and indirect references (Indexed*) * are required to reference objects in a thread-safe way, and in a way that allows minimisation of memory use by saving * non-referenced chains to disk. While you do not hold the KDevelop::DUChain::lock(), * these pointers and references should not be accessed, because the objects they will return may be * modified by other threads. * * The KDevelop::DUChain::lock() is a read-write lock, which means that if you don't intend to * change the chain (which you won't, unless you are a language plugin developer), you only need a read-lock. * This has the advantage of allowing multiple threads to safely read from the chain simultaneously. * The easiest way to acquire this lock is to use KDevelop::DUChainReadLocker: * \code * KDevelop::TopDUContextPointer topContext; * * // Retrieve the top context for myUrl (see explanation below) * topContext = KDevelop::DUChainUtils::standardContextForUrl( myUrl ); * * // Lock the duchain for reading * KDevelop::DUChainReadLocker readLock( KDevelop::DUChain::lock() ); * * // Check if the top context pointer is valid * if ( topContext ) { * ... * } * \endcode * Before accessing the top context, this code will block until a read-only lock has been acquired. * It is then safe to access const functions of all duchain objects. The lock will continue to be held * until readLock goes out of scope. * * \note It is safe to recursively acquire a read-lock (or a write-lock), but not safe to request a write lock * once a read lock is held (this may result in a deadlock). * \note You must not attempt to acquire the duchain lock when you already hold the smart lock (this may result in a deadlock). * * In debug builds, if you attempt to access something in the duchain which you do not hold the proper lock * for, you will encounter an assert (usually triggered by the ENSURE_CHAIN_READ_LOCKED or ENSURE_CHAIN_WRITE_LOCKED macros). * * For more information about duchain pointers, see KDevelop::DUChainPointer. * * \section searching-declarations Searching for declarations * If you have a qualified identifier and want to retrieve declaration(s) which share the * identifier, you can do this by using the symbol table (KDevelop::PersistentSymbolTable): * * \code * KDevelop::DUChainReadLocker lock( KDevelop::DUChain::lock() ); * KDevelop::PersistentSymbolTable::Declarations declarations = KDevelop::PersistentSymbolTable::self().getDeclarations( qualifiedIdentifier ); * * for (PersistentSymbolTable::Declarations::Iterator it = declarations.iterator(); it; ++it) { * KDevelop::Declaration* decl = it->declaration(); * // Use the declaration... * } * \endcode * * \section accessing Accessing a definition-use chain * The first step in using a duchain is to retrieve the chain that you are interested in. * Presumably you will know the URL of the file for which you want to retrieve the chain. * Some languages (notably C and C++) can have several different chains for one file depending on * what the definitions of macros were when the files were parsed. Because of this, the recommended * way to access the duchain for a document is via KDevelop::DUChainUtils. * * \subsection accessing-topcontext Accessing top level contexts * \todo include a note on how to request loading of contexts from disk, and requesting parsing of files which * are not currently in the duchain. * * Top level contexts (TopDUContext) can be retrieved through KDevelop::DUChainUtils::standardContextForUrl(). * This is the context which is presented to the user when the file is opened (for highlighting, completion etc.). * In case it is not the context which you are after, all contexts for a file can be retrieved via * KDevelop::DUChain::chainsForDocument(). * * \subsection accessing-declaration Accessing declarations at a specific location * If you have a url and a cursor location, you can attempt to retrieve the declaration located at that position * with KDevelop::DUChainUtils::itemUnderCursor(). * * \section navigating Navigating a definition-use chain * \subsection navigating-duobject All duchain objects * All duchain objects inherit from KDevelop::DUChainBase. This is in turn a subclass of KDevelop::DocumentRangeObject. * Thus, you can retrieve the text range of every object via KDevelop::DocumentRangeObject::range(). If the document * is currently loaded in a text editor, it will likely have a smart range (KTextEditor::SmartRange), which tracks the position * of the range when the document is changed. This can be accessed via KDevelop::DocumentRangeObject::smartRange(). * * \subsection navigating-contexts Contexts * Now that you have a chain, you'll probably want to be able to navigate around it. You can iterate contexts * by using KDevelop::DUContext::childContexts(). You can then retrieve from each * context a list of local declarations with KDevelop::DUContext::localDeclarations(), and a list of all * uses in the context with KDevelop::DUContext::uses(). * * Imported contexts are usually contexts which have declarations which are visible in the current context. * For example: * \code * for (int i = 0; i < count(); ++i) { * qDebug() << i; * } * \endcode * The code which contains the debug statement will import the for conditions context, which contains the declaration * of i. Thus, i's declaration is visible to the debug statement. * * Usually, you will not have to worry about these details, as the search functions already take them into account. * If you want to find a declaration for a given identifier in a given context, you can use one of the * KDevelop::DUContext::findDeclarations() or KDevelop::DUContext::findLocalDeclarations() functions. * * \subsection navigating-declarations Declarations * Declarations always occur within a context, which can be accessed through KDevelop::Declaration::parentContext(). * Some declarations (eg. namespaces, classes) create a new context which can then contain child declarations, eg. * variables and functions within a class. For these declarations, the associated context which contains these * child declarations can be accessed through KDevelop::Declaration::internalContext(), if one exists. * * \subsection navigating-uses Uses * Uses are instances where a declaration is referenced in the code. All uses for a declaration can be calculated * from the duchain, although this can potentially be a time-consuming task. KDevelop::Declaration::uses() will return * all uses for a declaration, and KDevelop::Declaration::smartUses() will return smart ranges which represent all uses * in the currently opened documents. * * \subsection navigating-types Types * Declarations may have a type, which can be retrieved through KDevelop::Declaration::abstractType(). Types can then * be visited using KDevelop::TypeVisitor, or manually with the corresponding calls in the type subclasses. Types can be * compared for equality using KDevelop::AbstractType::equals(). * * \section efficiency DUChain efficiency issues * It was confirmed during the implementation of the DUChain that there is too much information to store the duchain for * an entire project in memory at the same time (KDevPlatform itself was >1Gb). Subsequently, saving the chains to disk * has been implemented. Following are some of the ramifications of this design. * * \subsection referenced-topcontexts Top Context Referencing * In order to determine which chains can be unloaded from memory, a referenced pointer was introduced called * KDevelop::ReferencedTopDUContext. If you are using duchain objects outside of a duchain lock, and you need them to * remain in memory, you should create a KDevelop::ReferencedTopDUContext for the top context of each of the chains you need. * This will ensure it is not unloaded. However, do not use this excessively or %KDevelop will have the same problem * of using large amounts of memory. * * \code * KDevelop::TopDUContextPointer topContext; * KDevelop::ReferencedTopDUContext topReferenced * * topContext = KDevelop::DUChainUtils::standardContextForUrl( myUrl ); * topReferenced = KDevelop::DUChainUtils::standardContextForUrl( myOtherUrl ); * * // Both of these pointers may be valid here * * // Sleep * sleep(10); * * // Lock the duchain for reading * KDevelop::DUChainReadLocker readLock( KDevelop::DUChain::lock() ); * * // topContext may not be valid any more, because it may have been saved to disk and unloaded from memory. * // topReferenced will still be valid if it was valid when it was retrieved (above). * \endcode * * \subsection code-model Code Model * In order to facilitate easy access to top level declarations, a list of top level declarations is available from * KDevelop::CodeModel. For each parsed file, you can call KDevelop::CodeModel::items() to retrieve a list of declarations * and some basics about their type. If you need any further information, the chain must be loaded from disk. * \code * uint count; * const CodeModelItem* items; * IndexedString file = \; * * // Retrieve the items for the given file * KDevelop::CodeModel::self().items(file, count, items); * * for (int i = 0; i < count; ++i) { * CodeModelItem* thisItem = items++; * * // Use the item here. * ... * } * \endcode * * To access the declaration for each item, use KDevelop::PersistentSymbolTable::declarations(). * * \section inadequate When the duchain doesn't contain all the information * If you need more information than is available in the duchain, you're most likely looking at using the AST generated by the * language support. Note that this is obviously not language-independent, so it should be a last resort in cases where * the functionality being supplied is not language-specific. If the duchain is missing some information that would make * sense to add, please raise it with the %KDevelop developers. * * \todo add mechanism to get at the AST * \todo keep the AST in memory for loaded files */ // DOXYGEN_REFERENCES = language/editor // DOXYGEN_SET_WARN_LOGFILE=language/duchain/doxygen.log // DOXYGEN_SET_RECURSIVE=yes