TokenSequence (Lexer) - NetBeans API Javadoc (Current Development Version)

org.netbeans.modules.lexer/2 1.19.0 1

org.netbeans.api.lexer
Class TokenSequence<T extends TokenId>

java.lang.Object
  org.netbeans.api.lexer.TokenSequence<T>

public final class TokenSequence<T extends TokenId>
extends Object

A token sequence allows iteration over the tokens of a token hierarchy.
The token sequence for the top-level language of a token hierarchy may be obtained by TokenHierarchy.tokenSequence().

Use of token sequence is a two-step operation:

1. Position the token sequence before the token that should be retrieved first (or behind the desired token when iterating backwards).
   One of the following ways may be used:
   - move(int) positions TS before the token that either starts at the given offset or "contains" it.
   - moveIndex(int) positions TS before the n-th token in the underlying token list.
   - moveStart() positions TS before the first token.
   - moveEnd() positions TS behind the last token.
   - Do nothing: TS is positioned before the first token by default.
   The token sequence is always positioned between tokens after one of the operations above (token() returns null to signal the between-tokens location).
2. Start iterating through the tokens in the forward or backward direction by using moveNext() or movePrevious().
   If moveNext() or movePrevious() returns true then TS is positioned over a concrete token retrievable by token().
   Its offset can be retrieved by offset().

An example of forward iteration through the tokens:

   TokenSequence ts = tokenHierarchy.tokenSequence();
   // Possible positioning by ts.move(offset) or ts.moveIndex(index)
   while (ts.moveNext()) {
       Token t = ts.token();
       if (t.id() == ...) { ... }
       if (TokenUtilities.equals(t.text(), "mytext")) { ... }
       if (ts.offset() == ...) { ... }
   }
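
Backward iteration follows the same two-step pattern: position behind the last token with moveEnd(), then call movePrevious() until it returns false. The sketch below is a minimal, self-contained model of that cursor behaviour only. `Cursor` and `reversed` are hypothetical names introduced for illustration and are not part of the NetBeans API; with a real TokenSequence the loop body would use ts.token() and ts.offset() in the same way.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class BackwardIterationSketch {
    /** Hypothetical stand-in for a token sequence; models cursor movement only. */
    static final class Cursor {
        private final List<String> tokens;
        private int index;        // current token, or the token that follows the cursor
        private boolean onToken;  // false while positioned between tokens

        Cursor(List<String> tokens) {
            this.tokens = tokens;
        }

        /** Position behind the last token. */
        void moveEnd() {
            index = tokens.size();
            onToken = false;
        }

        /** Step back by one token; false if nothing is left in the backward direction. */
        boolean movePrevious() {
            if (index == 0) {
                return false;
            }
            index--;
            onToken = true;
            return true;
        }

        /** Current token, or null while positioned between tokens. */
        String token() {
            return onToken ? tokens.get(index) : null;
        }
    }

    /** Collects the tokens in reverse order using moveEnd() and movePrevious(). */
    static List<String> reversed(List<String> tokens) {
        Cursor ts = new Cursor(tokens);
        ts.moveEnd();                      // step 1: position behind the last token
        List<String> out = new ArrayList<>();
        while (ts.movePrevious()) {        // step 2: iterate backwards
            out.add(ts.token());
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(reversed(Arrays.asList("public", "static", "int")));
    }
}
```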

This class should be used by a single thread only.

Method Summary

boolean createEmbedding(Language<? extends TokenId> embeddedLanguage, int startSkipLength, int endSkipLength)
          Create language embedding without joining of the embedded sections.

boolean createEmbedding(Language<? extends TokenId> embeddedLanguage, int startSkipLength, int endSkipLength, boolean joinSections)
          Create language embedding described by the given parameters.

TokenSequence<? extends TokenId> embedded()
          Get embedded token sequence if the token to which this token sequence is currently positioned has a language embedding.

<ET extends TokenId> TokenSequence<ET> embedded(Language<ET> embeddedLanguage)
          Get embedded token sequence if the token to which this token sequence is currently positioned has a language embedding.

int index()
          Get an index of token to which (or before which) this TS is currently positioned.

boolean isEmpty()
          Check whether this TS contains zero tokens.

Language<T> language()
          Get the language describing token ids used by tokens in this token sequence.

LanguagePath languagePath()
          Get the complete language path of the tokens contained in this token sequence.

int move(int offset)
          Move token sequence to be positioned between index-1 and index tokens where Token[index] either starts at offset or "contains" the offset.

void moveEnd()
          Move the token sequence to be positioned behind the last token.

int moveIndex(int index)
          Position token sequence between index-1 and index tokens.

boolean moveNext()
          Move to the next token in this token sequence.

boolean movePrevious()
          Move to a previous token in this token sequence.

void moveStart()
          Move the token sequence to be positioned before the first token.

int offset()
          Get the offset of the current token in the underlying input.

Token<T> offsetToken()
          Similar to token() but always returns a non-flyweight token with the appropriate offset.

TokenSequence<T> subSequence(int startOffset)
          Create sub sequence of this token sequence that only returns tokens above the given offset.

TokenSequence<T> subSequence(int startOffset, int endOffset)
          Create sub sequence of this token sequence that only returns tokens between the given offsets.

Token<T> token()
          Get the token to which this token sequence points, or null if TS is positioned between tokens (moveNext() or movePrevious() has not been called yet).

int tokenCount()
          Return total count of tokens in this sequence.

String toString()


Methods inherited from class java.lang.Object

clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait

Method Detail

language

public Language<T> language()

Get the language describing token ids used by tokens in this token sequence.

languagePath

public LanguagePath languagePath()

Get the complete language path of the tokens contained in this token sequence.

token

public Token<T> token()

Get the token to which this token sequence points, or null if TS is positioned between tokens (moveNext() or movePrevious() has not been called yet).
A typical iteration usage:

   TokenSequence ts = tokenHierarchy.tokenSequence();
   // Possible positioning by ts.move(offset) or ts.moveIndex(index)
   while (ts.moveNext()) {
       Token t = ts.token();
       if (t.id() == ...) { ... }
       if (TokenUtilities.equals(t.text(), "mytext")) { ... }
       if (ts.offset() == ...) { ... }
   }

The returned token instance may be flyweight (Token.isFlyweight() returns true), which means that its Token.offset(TokenHierarchy) will return -1.
To find the correct offset use offset().
If it is necessary to obtain a regular non-flyweight token, offsetToken() may be used.

The lifetime of the returned token instance may be limited for mutable inputs. The token instance should not be held across the input source modifications.

Returns: token instance to which this token sequence is currently positioned or null if this token sequence is not positioned to any token which may happen after TS creation or after use of move(int) or moveIndex(int).
See Also: offsetToken()

offsetToken

public Token<T> offsetToken()

Similar to token() but always returns a non-flyweight token with the appropriate offset.
If the current token is flyweight then this method replaces it with the corresponding non-flyweight token which it then returns.
Subsequent calls to token() will also return this non-flyweight token.

This method may be handy if the token instance is referenced in a standalone way (e.g. in an expression node of a parse tree) and it's necessary to get the appropriate offset from the token itself later when a token sequence will not be available.

Throws: IllegalStateException - if token() returns null.

offset

public int offset()

Get the offset of the current token in the underlying input.
The token's offset should never be computed by a client of the token sequence by adding/subtracting tokens' length to a client's variable because in case of the immutable token sequences there can be gaps between tokens if some tokens get filtered out.
Instead this method should always be used because it offers best performance with a constant time complexity.

Returns: >=0 absolute offset of the current token in the underlying input.
Throws: IllegalStateException - if token() returns null.
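
To illustrate why summing token lengths is unreliable, the following sketch models a filtered view of the input "int  x" in which the whitespace token has been dropped. The arrays and method names are hypothetical and exist only for this illustration; they do not reflect how the lexer module stores tokens.

```java
final class OffsetGapSketch {
    // Hypothetical filtered view of the input "int  x": the whitespace token
    // "  " (offset 3, length 2) was filtered out, which leaves a gap.
    static final int[] STARTS = {0, 5};   // start offsets of "int" and "x"
    static final int[] LENGTHS = {3, 1};  // lengths of "int" and "x"

    /** Offset a client would get by summing the lengths of the preceding tokens. */
    static int summedOffset(int tokenIndex) {
        int offset = 0;
        for (int i = 0; i < tokenIndex; i++) {
            offset += LENGTHS[i];
        }
        return offset;
    }

    /** Offset recorded for the token itself, unaffected by filtered-out tokens. */
    static int realOffset(int tokenIndex) {
        return STARTS[tokenIndex];
    }

    public static void main(String[] args) {
        System.out.println("summed=" + summedOffset(1) + " real=" + realOffset(1));
    }
}
```

For the second token the summed value and the real offset disagree by the length of the filtered whitespace, which is the situation offset() is documented to handle.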

index

public int index()

Get an index of token to which (or before which) this TS is currently positioned.

Initially or after move(int) or moveIndex(int) token sequence is positioned between tokens:

          Token[0]   Token[1]   ...   Token[n]
        ^          ^                ^
 Index: 0          1                n

After use of moveNext() or movePrevious() the token sequence is positioned over one of the actual tokens:

          Token[0]   Token[1]   ...   Token[n]
             ^          ^                ^
 Index:      0          1                n

Returns: >=0 index of token to which (or before which) this TS is currently positioned.

embedded

public TokenSequence<? extends TokenId> embedded()

Get embedded token sequence if the token to which this token sequence is currently positioned has a language embedding.
If there is a custom embedding created by createEmbedding(Language,int,int) it will be returned instead of the default embedding (the one created by LanguageHierarchy.embedding() or LanguageProvider).

Returns: embedded sequence or null if no embedding exists for this token.
Throws: IllegalStateException - if token() returns null.

embedded

public <ET extends TokenId> TokenSequence<ET> embedded(Language<ET> embeddedLanguage)

Get embedded token sequence if the token to which this token sequence is currently positioned has a language embedding.

Throws: IllegalStateException - if token() returns null.

createEmbedding

public boolean createEmbedding(Language<? extends TokenId> embeddedLanguage,
                               int startSkipLength,
                               int endSkipLength)

Create language embedding without joining of the embedded sections.

Throws: IllegalStateException - if token() returns null.
See Also: createEmbedding(Language, int, int, boolean)

createEmbedding

public boolean createEmbedding(Language<? extends TokenId> embeddedLanguage,
                               int startSkipLength,
                               int endSkipLength,
                               boolean joinSections)

Create language embedding described by the given parameters.
If the underlying text input is mutable then this method should only be called within a read lock over the text input.

Parameters:

embeddedLanguage - non-null embedded language

startSkipLength - >=0 number of characters in an initial part of the token for which the language embedding is defined that should be excluded from the embedded section. The excluded characters will not be lexed and there will be no tokens created for them.

endSkipLength - >=0 number of characters at the end of the token for which the language embedding is defined that should be excluded from the embedded section. The excluded characters will not be lexed and there will be no tokens created for them.

joinSections - whether sections with this embedding should be joined across the input source or whether they should stay separate.
For example for HTML sections embedded in JSP this flag should be true:

   <!-- HTML comment start
       <% System.out.println("Hello"); %>
            still in HTML comment -->

Only the embedded sections with the same language path can be joined.

Returns:

true if the embedding was created successfully or false if an embedding with the given language already exists for this token.

Throws:

IllegalStateException - if token() returns null.
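
The effect of startSkipLength and endSkipLength can be pictured as trimming the token text before it is handed to the embedded lexer. The sketch below shows only that arithmetic on a plain string. `embeddedSection` is a hypothetical helper, and rejecting skip lengths that exceed the token length is an assumption of this sketch, since the behaviour of createEmbedding() in that case is not stated here.

```java
final class SkipLengthSketch {
    /**
     * Returns the part of the token text that would form the embedded section
     * once the leading and trailing skipped characters are excluded.
     */
    static String embeddedSection(String tokenText, int startSkipLength, int endSkipLength) {
        if (startSkipLength < 0 || endSkipLength < 0) {
            throw new IllegalArgumentException("skip lengths must be >= 0");
        }
        // Assumption of this sketch: skipping more than the token holds is an error.
        if (startSkipLength + endSkipLength > tokenText.length()) {
            throw new IllegalArgumentException("skip lengths exceed token length");
        }
        return tokenText.substring(startSkipLength, tokenText.length() - endSkipLength);
    }

    public static void main(String[] args) {
        // A string literal token including its quotes; skip one character on each side.
        System.out.println(embeddedSection("\"abc\"", 1, 1));
    }
}
```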

moveNext

public boolean moveNext()

Move to the next token in this token sequence.

The next token may not necessarily start at the offset where the previous token ends (there may be gaps between tokens caused by token filtering). offset() should be used for offset retrieval.

Returns: true if the sequence was successfully moved to the next token or false if it was not moved because there are no more tokens in the forward direction.
Throws: ConcurrentModificationException - if this token sequence is no longer valid because of an underlying mutable input source modification.

movePrevious

public boolean movePrevious()

Move to a previous token in this token sequence.

The previous token may not necessarily end at the offset where the current token starts (there may be gaps between tokens caused by token filtering). offset() should be used for offset retrieval.

Returns: true if the sequence was successfully moved to the previous token or false if it was not moved because there are no more tokens in the backward direction.
Throws: ConcurrentModificationException - if this token sequence is no longer valid because of an underlying mutable input source modification.

moveIndex

public int moveIndex(int index)

Position token sequence between index-1 and index tokens.
TS will be positioned in the following way:

          Token[0]   ...   Token[index-1]   Token[index] ...
        ^                ^                ^
 Index: 0             index-1           index

Subsequent moveNext() or movePrevious() is needed to fetch a concrete token in the desired direction.
Subsequent moveNext() will position TS over Token[index] (or movePrevious() will position TS over Token[index-1]) so that token() != null.

Parameters: index - index of the token to which this sequence should be positioned.
If index >= tokenCount() then the TS will be positioned to tokenCount().
If index < 0 then the TS will be positioned to index 0.
Returns: difference between requested index and the index to which TS is really set.
Throws: ConcurrentModificationException - if this token sequence is no longer valid because of an underlying mutable input source modification.
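
The clamping rules and the return value described above amount to the following arithmetic. This is a standalone sketch, not the module's code: `MoveIndexSketch.moveIndex` is a hypothetical helper that takes the token count as a parameter and returns both the resulting index and the documented difference.

```java
final class MoveIndexSketch {
    /**
     * Returns a two-element array: the index the sequence ends up at, and the
     * difference (requested index minus actual index).
     */
    static int[] moveIndex(int requestedIndex, int tokenCount) {
        // index >= tokenCount() is clamped to tokenCount(); index < 0 is clamped to 0.
        int actualIndex = Math.max(0, Math.min(requestedIndex, tokenCount));
        return new int[] { actualIndex, requestedIndex - actualIndex };
    }

    public static void main(String[] args) {
        int[] result = moveIndex(5, 3);
        System.out.println("index=" + result[0] + " difference=" + result[1]);
    }
}
```

A return value of 0 therefore means the requested index was valid, while a non-zero value tells the caller how far the request was outside the token list.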

moveStart

public void moveStart()

Move the token sequence to be positioned before the first token.
This is equivalent to moveIndex(0).

moveEnd

public void moveEnd()

Move the token sequence to be positioned behind the last token.
This is equivalent to moveIndex(tokenCount()).

move

public int move(int offset)

Move token sequence to be positioned between index-1 and index tokens where Token[index] either starts at offset or "contains" the offset.

        +----------+-----+----------------+--------------+------
        | Token[0] | ... | Token[index-1] | Token[index] | ...
        | "public" | ... | "static"       | "int"        | ...
        +----------+-----+----------------+--------------+------
        ^                ^                ^
 Index: 0             index-1           index
 Offset:                                  ---^ (if offset points to 'i','n' or 't')

Subsequent moveNext() or movePrevious() is needed to fetch a concrete token.
If the offset is too big then the token sequence will be positioned behind the last token.

If token filtering is used there may be gaps that are not covered by any tokens and if the offset is contained in such gap then the token sequence will be positioned before the token that follows the gap.

Parameters: offset - absolute offset to which the token sequence should be moved.
Returns: difference between the requested offset and the start offset of the token before which the token sequence gets positioned.
Throws: ConcurrentModificationException - if this token sequence is no longer valid because of an underlying mutable input source modification.
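
The positioning rule can be sketched over plain arrays of token start offsets and lengths. The helper below is hypothetical and uses a linear scan for clarity. The value it returns when the offset lies behind the last token (offset minus the end of the last token) is an assumption of this sketch, because the text above does not specify the return value for that case.

```java
final class MoveOffsetSketch {
    /**
     * Returns a two-element array: the index of the token the sequence is
     * positioned before, and the difference between the requested offset and
     * that token's start offset.
     */
    static int[] move(int[] starts, int[] lengths, int offset) {
        for (int i = 0; i < starts.length; i++) {
            // The first token ending after the offset either starts at it,
            // contains it, or follows a gap that contains it.
            if (offset < starts[i] + lengths[i]) {
                return new int[] { i, offset - starts[i] };
            }
        }
        // Offset is too big: position behind the last token.
        int n = starts.length;
        int end = (n == 0) ? 0 : starts[n - 1] + lengths[n - 1];
        return new int[] { n, offset - end }; // assumption, see the note above
    }

    public static void main(String[] args) {
        // "public static int" with the whitespace tokens filtered out.
        int[] starts = {0, 7, 14};
        int[] lengths = {6, 6, 3};
        int[] result = move(starts, lengths, 15); // offset of 'n' in "int"
        System.out.println("index=" + result[0] + " difference=" + result[1]);
    }
}
```

With these arrays an offset inside the gap between "public" and "static" yields the index of "static" and a negative difference, which matches the rule that the sequence is positioned before the token that follows the gap.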

isEmpty

public boolean isEmpty()

Check whether this TS contains zero tokens.
This check is strongly preferred over tokenCount() == 0.

See Also: tokenCount()

tokenCount

public int tokenCount()

Return total count of tokens in this sequence.
Note: Calling this method will lead to creation of all the remaining tokens in the sequence if they were not yet created.

Returns: total number of tokens in this token sequence.
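
One plausible reading of why isEmpty() is preferred over tokenCount() == 0 is that tokens are created on demand: an emptiness check needs at most the first token, whereas counting forces all of them to be created. The toy model below illustrates that difference. `LazyTokens` and its whitespace-splitting "lexer" are hypothetical and are not how the lexer module is implemented.

```java
import java.util.ArrayList;
import java.util.List;

final class LazyCountSketch {
    /** Hypothetical token list that creates its tokens one at a time, on demand. */
    static final class LazyTokens {
        private final String[] words;                       // pending "token" texts
        private final List<String> created = new ArrayList<>();

        LazyTokens(String input) {
            String trimmed = input.trim();
            this.words = trimmed.isEmpty() ? new String[0] : trimmed.split("\\s+");
        }

        /** Creates the next token if there is one; returns false when exhausted. */
        private boolean createNext() {
            if (created.size() == words.length) {
                return false;
            }
            created.add(words[created.size()]);
            return true;
        }

        /** Needs at most one token to be created. */
        boolean isEmpty() {
            return created.isEmpty() && !createNext();
        }

        /** Forces creation of all remaining tokens. */
        int tokenCount() {
            while (createNext()) {
                // keep creating until the input is exhausted
            }
            return created.size();
        }

        int createdSoFar() {
            return created.size();
        }
    }

    public static void main(String[] args) {
        LazyTokens tokens = new LazyTokens("public static int");
        System.out.println(tokens.isEmpty() + " after creating " + tokens.createdSoFar());
        System.out.println(tokens.tokenCount() + " after creating " + tokens.createdSoFar());
    }
}
```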

subSequence

public TokenSequence<T> subSequence(int startOffset)

Create sub sequence of this token sequence that only returns tokens above the given offset.

Parameters: startOffset - only tokens satisfying tokenStartOffset + tokenLength > startOffset will be present in the returned sequence.
Returns: non-null sub sequence of this token sequence.

subSequence

public TokenSequence<T> subSequence(int startOffset,
                                    int endOffset)

Create sub sequence of this token sequence that only returns tokens between the given offsets.

Parameters: startOffset - only tokens satisfying tokenStartOffset + tokenLength > startOffset will be present in the returned sequence.
endOffset - >=startOffset. Only tokens satisfying tokenStartOffset < endOffset will be present in the returned sequence.
Returns: non-null sub sequence of this token sequence.
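
The two conditions above mean that a token is kept if it overlaps the range at all, including tokens that only partially overlap either boundary. The sketch below applies exactly those two predicates to plain arrays of start offsets and lengths. `selected` is a hypothetical helper introduced for illustration; it returns the indexes of the tokens that would be present.

```java
import java.util.ArrayList;
import java.util.List;

final class SubSequenceSketch {
    /** Returns indexes of tokens that satisfy both documented conditions. */
    static List<Integer> selected(int[] starts, int[] lengths, int startOffset, int endOffset) {
        List<Integer> indexes = new ArrayList<>();
        for (int i = 0; i < starts.length; i++) {
            boolean endsAfterStart = starts[i] + lengths[i] > startOffset;
            boolean startsBeforeEnd = starts[i] < endOffset;
            if (endsAfterStart && startsBeforeEnd) {
                indexes.add(i);
            }
        }
        return indexes;
    }

    public static void main(String[] args) {
        // Tokens "public" (0..6), "static" (7..13) and "int" (14..17).
        int[] starts = {0, 7, 14};
        int[] lengths = {6, 6, 3};
        System.out.println(selected(starts, lengths, 5, 15));
    }
}
```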

toString

public String toString()

Overrides: toString in class Object
