Kentico Document/Page Retrieval Methods

Posted: 8/7/2018 11:21:36 AM by Chris Bass
Filed under: API, How-To, Performance
Kentico's current API offers a large number of ways to access documents. PageInfoProvider, TreeProvider, and DocumentHelper are the three primary ones, and each is useful in its own set of scenarios.

First, there’s the PageInfoProvider, which gives cached access to a page’s document properties. The three most likely ways you’ll access those are: DocumentContext.CurrentPageInfo, DocumentContext.CurrentParentPageInfos (gets the ancestor pages’ infos, ordered by level), and PageInfoProvider.GetPageInfo(documentguid).
What gets cached is governed by Kentico’s “Page Info Cache” settings. The Node-level and Document-level data is available, but page-type-specific data is not; neither are categories, related documents, or other joined data. These objects are read-only and cannot be used for updates or deletes.

UPDATE: Unfortunately, since Kentico 10, the documentguid version of GetPageInfo() has been removed; the Obsolete notes from Kentico 9 say that it was because Linked Documents share a DocumentGuid. The remaining methods for using PageInfoProvider only work well within Context (for getting the current page, or its parent, for example). The remaining GetPageInfo() overloads require supplying a lot of extra data (site name, culture name, alias path, URL path, combine with default culture, optional nodeID for a performance boost), and the cache key they generate is extremely detailed. For retrieving a cached document, it's actually less effort and more reusable to just cache the TreeNode yourself using CacheHelper.Cache() with a custom key, rather than using these methods.
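
As a rough illustration of caching the TreeNode yourself, here is a minimal sketch. The CachedPageLookup class, the GetPage method, the "custompagelookup" key prefix, and the 10-minute duration are all made up for this example; it also assumes that looking the page up by NodeGUID suits your case and that the Kentico assemblies are referenced. Treat it as a starting point rather than the recommended pattern.

```csharp
using System;
using System.Linq;
using CMS.DocumentEngine;
using CMS.Helpers;

// Hypothetical helper class, not part of the Kentico API.
public static class CachedPageLookup
{
    // Returns the published page with the given NodeGUID,
    // keeping the TreeNode in the cache for 10 minutes (assumed duration).
    public static TreeNode GetPage(Guid nodeGuid, string siteName)
    {
        return CacheHelper.Cache(
            cs =>
            {
                // This delegate only runs on a cache miss.
                // TreeProvider is used because it returns published data.
                var tree = new TreeProvider();

                return tree.SelectNodes()
                    .OnSite(siteName)
                    .WhereEquals("NodeGUID", nodeGuid)
                    .TopN(1)
                    .FirstOrDefault(); // LINQ; the query enumerates TreeNode objects
            },
            // Custom key: the parts after the minutes are combined into the cache item name.
            new CacheSettings(10, "custompagelookup", siteName, nodeGuid));
    }
}
```

Note that this sketch sets no cache dependency, so an edited page can stay stale until the 10 minutes run out; if that matters, a dependency can be assigned to the CacheSettings object inside the delegate.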

TreeProvider is the class used for retrieving the last published version of data; it’s used when displaying pages to end users on the live site. It has many methods, but three stand out as especially useful: .SelectNodes(), .SelectSingleDocument(), and .SelectSingleNode(). Because these methods retrieve the published data, not the latest edited data, they can be used for data retrieval but should not be used for edits, since you aren't necessarily working with the latest version.
TreeProvider can be used with strong typing with the SelectNodes() method: TreeProvider.SelectNodes().OfType<X>()
SelectNodes also gives you an ObjectQuery like DocumentHelper so you can work with all the same query syntax.
You can use TreeProvider safely to retrieve nodes for deletion, but remember that it only retrieves documents that have published versions.
If you’re just retrieving base properties, PageInfoProvider is usually faster due to the page cache, but it has limited methods for *retrieving* data: mostly just by GUID, unless you want to supply a bunch of data about culture and such. So when I need a lot of data from a set of pages I don’t have the GUIDs for, one thing I’ll do is use TreeProvider to *query* for the documentGUIDs of the pages I care about, and then use PageInfoProvider to actually retrieve the full records themselves.
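
To make the TreeProvider querying described above a bit more concrete, here is a small sketch of a SelectNodes() call for live-site display. The PublishedArticles class, the "Custom.Article" page type, and the "/Articles" path are placeholders for illustration only, and the Kentico assemblies are assumed to be referenced.

```csharp
using System.Collections.Generic;
using System.Linq;
using CMS.DocumentEngine;

// Hypothetical helper class, not part of the Kentico API.
public static class PublishedArticles
{
    // "Custom.Article" and "/Articles" are placeholder values.
    public static List<TreeNode> GetArticles(string siteName)
    {
        var tree = new TreeProvider();

        // SelectNodes(className) returns a query object, so the usual
        // query methods (Path, OnSite, OrderBy, ...) can be chained.
        return tree.SelectNodes("Custom.Article")
            .Path("/Articles", PathTypeEnum.Children)
            .OnSite(siteName)
            .OrderBy("NodeOrder")
            .ToList(); // executes the query and materializes the results
    }
}
```

The query is only executed when it is enumerated (here by ToList()), so the filters can be built up conditionally before that point.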

DocumentHelper is the way to retrieve the latest edited version of data, and to edit data yourself. Keep in mind, the latest edited data may differ from the published data - for data retrieval that you plan to show on the live site, you probably want to use TreeProvider instead.
DocumentHelper offers a variety of methods for filtering by type, such as a generic GetDocuments<X>(), as well as a few non-generic TreeNode-based options: GetDocuments("X"), GetDocuments().Types("X") (which also allows for multiple types: ("X", "Y", "etc")), and GetDocuments().Type("X") (which can be chained to get multiple types). You can also just not filter on type at all (GetDocuments()). In addition to this, there are options for limiting what columns you select (.Columns() and .Column()), as well as for expanding the selected columns to include the documents' type-specific columns (WithCoupledColumns()).

By default, it retrieves the latest edited data, and, if you specify a type (.Types("X"), .Type("X"), GetDocuments<X>()) then it also retrieves the doctype-specific data as well as the basic document-level data. This is also the case if you just use GetDocuments().WithCoupledColumns(), but bear in mind that without filtering on type, you're potentially joining together many tables, and chances are you don't need all of that.
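
The type-filtering variants above look roughly like this side by side. This is only a sketch: the TypeFilterExamples class and the "Custom.Article" and "Custom.News" page type code names are invented for the example, and nothing here hits the database because the queries are never enumerated.

```csharp
using CMS.DocumentEngine;

// Hypothetical class, not part of the Kentico API.
public static class TypeFilterExamples
{
    // "Custom.Article" and "Custom.News" are placeholder page type code names.
    public static void BuildQueries()
    {
        // Single type: type-specific (coupled) columns come along automatically.
        var articles = DocumentHelper.GetDocuments("Custom.Article");

        // Several types in a single Types() call.
        var articlesAndNews = DocumentHelper.GetDocuments()
            .Types("Custom.Article", "Custom.News");

        // The same selection built by chaining Type() calls.
        var chained = DocumentHelper.GetDocuments()
            .Type("Custom.Article")
            .Type("Custom.News");

        // No type filter, but with coupled columns for every type.
        // This can join many tables, so use it sparingly.
        var everything = DocumentHelper.GetDocuments()
            .WithCoupledColumns();
    }
}
```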

When using a DocumentQuery (SelectNodes() or GetDocuments()) for simply retrieving data that you don't plan on updating/editing, performance can be improved by using .Columns() or .Column(), which limit the dataset to the columns you request. This has the advantage of cutting out potentially hundreds of extra columns, including base document-level columns.
Be sure not to do this when you plan on updating the documents; you need all of the columns to make sure you're saving the complete object. If necessary, retrieve the documents with columns filtered, do any read operations that you need, then write a second query to retrieve the full dataset for editing.
Similarly, if you're editing page data, make sure you've got the type-specific data, either by having called one of the Types()/Type()/GetDocuments<X>() variants, or by explicitly calling .WithCoupledColumns() on your query. There is a boolean "IsCoupled" on TreeNodes that can tell you if you've retrieved the data in a way that includes the type-specific data. (It won't tell you if you've then also limited the columns with Columns(); you'll have to keep track of that yourself.)
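
Here is one way the "read with limited columns, then re-query for editing" approach might look. The ReadThenEdit class, the RenameFirstArticle method, and the "Custom.Article" page type are hypothetical, and the sketch assumes the page is not under workflow (no check-out or versioning steps are shown).

```csharp
using System.Linq;
using CMS.DocumentEngine;

// Hypothetical class, not part of the Kentico API.
public static class ReadThenEdit
{
    // "Custom.Article" is a placeholder page type code name.
    public static void RenameFirstArticle(string newName)
    {
        // Read-only query: only the listed columns are loaded.
        TreeNode summary = DocumentHelper.GetDocuments("Custom.Article")
            .Columns("DocumentID", "DocumentName")
            .TopN(1)
            .FirstOrDefault();

        if (summary == null)
        {
            return; // nothing to edit
        }

        // Second query without Columns(), so the complete record is loaded.
        TreeNode page = DocumentHelper.GetDocuments("Custom.Article")
            .WhereEquals("DocumentID", summary.DocumentID)
            .TopN(1)
            .FirstOrDefault();

        // IsCoupled confirms the type-specific data was retrieved.
        if (page != null && page.IsCoupled)
        {
            page.DocumentName = newName;
            DocumentHelper.UpdateDocument(page, new TreeProvider());
        }
    }
}
```

The extra round trip costs a second query, but it keeps the cheap, narrow query for the read path and guarantees the save works on a complete object.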

When reading data, it may also make sense to only retrieve documents that are currently published; there's a .Published() method for that. Note that this does not necessarily bring back the latest published data: TreeProvider is still the way to get the latest published version, and DocumentHelper is still the way to get the latest edited version.
