
Indexes on DB2

Concept of Indexes:


An index in DB2 is a separate structure that contains a copy of selected columns or fields from a table, along with a reference to the location of the corresponding rows in the table.

Indexes are organized in a specific data structure (such as a B-tree or hash table) that allows for quick lookup and retrieval of data based on the indexed columns.

Each index entry typically consists of the indexed column values and a pointer to the actual data location.


Purpose and Benefits of Indexes:


Improved Data Retrieval Performance: Indexes allow for faster data retrieval by providing a direct path to the desired rows, reducing the need for full table scans.


Efficient Query Execution: Indexes enable the query optimizer to choose efficient access paths, such as index scans or index-based joins, leading to faster query execution times.


Reduced Disk I/O: By narrowing down the search space, indexes can minimize the number of disk blocks read during data retrieval, resulting in lower I/O overhead.


Support for Constraints and Uniqueness: Indexes can enforce uniqueness constraints by ensuring that indexed columns contain unique values or combinations, promoting data integrity.


Ordering and Sorting: Index keys are stored in sorted order, so queries that need results ordered by the indexed columns can often read them through the index and avoid a separate sort step.
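As a small illustration of the uniqueness and ordering benefits listed above, here is a minimal sketch. The table orders_demo and all column and index names are hypothetical, chosen only for this example.

```sql
-- Hypothetical table used only for illustration.
CREATE TABLE orders_demo (
    order_id    INTEGER NOT NULL,
    customer_id INTEGER NOT NULL,
    order_date  DATE    NOT NULL
);

-- A unique index rejects any second row with the same order_id.
CREATE UNIQUE INDEX ux_orders_demo_id ON orders_demo (order_id);

-- Index keys are kept in order, so the optimizer may read this index
-- to return rows already sorted instead of running a separate sort.
CREATE INDEX ix_orders_demo_date ON orders_demo (order_date);

SELECT order_id, order_date
FROM orders_demo
ORDER BY order_date;
```

Whether the optimizer actually uses the index for the ORDER BY depends on table size and statistics, so treat this as a sketch rather than a guaranteed access path.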


Types of Indexes in DB2:


B-tree Indexes: The most common type, which organizes index entries in a balanced tree structure for efficient range-based searches and equality lookups.


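Bitmap Indexes: Used for columns with a limited number of distinct values, where a bitmap records which rows hold each value. Note that Db2 for Linux, UNIX and Windows and Db2 for z/OS do not offer a bitmap index that you create yourself, so verify what your Db2 platform supports before planning around one.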


Hash Indexes: Suitable for equality searches only, where the index key is hashed to provide direct access to the corresponding data location. Availability varies by Db2 platform and version, so confirm support before relying on one.


Function-based Indexes: Created on a function or expression applied to one or more columns (Db2 documentation calls these expression-based indexes), so that predicates written with the same expression can use the index.
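For the function-based case in particular, a short sketch may help. It assumes a Db2 platform and release that accepts expressions in index keys, and a hypothetical table customers_demo with a VARCHAR column last_name; none of these names come from a real schema.

```sql
-- Index the upper-cased form of last_name (hypothetical table and column).
CREATE INDEX ix_customers_demo_upper_name
    ON customers_demo (UPPER(last_name));

-- A predicate written with the same expression can match the index,
-- which allows a case-insensitive lookup without scanning the table.
SELECT *
FROM customers_demo
WHERE UPPER(last_name) = 'SMITH';
```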



The basic syntax for creating an index is:

CREATE INDEX index_name ON table_name (column1, column2, ...)


In the above syntax:


index_name is the name you assign to the index.

table_name is the name of the table on which the index is being created.

(column1, column2, ...) specifies the column(s) included in the index.
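To make the placeholders concrete, here is a hedged example of a two-column index. The table emp_demo, its columns, and the index name are hypothetical.

```sql
-- Hypothetical table and index names, for illustration only.
-- Column order matters: dept_id is the leading key column.
CREATE INDEX ix_emp_demo_dept_name
    ON emp_demo (dept_id ASC, last_name ASC);

-- This query filters on the leading column and orders by the second,
-- so it is a good candidate for the index above.
SELECT last_name
FROM emp_demo
WHERE dept_id = 10
ORDER BY last_name;
```

A query that filters only on last_name is a weaker match for this index, because last_name is not the leading key column.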


Now let's discuss the different methods of index creation in DB2, considerations to be checked, and tradeoffs involved:


Traditional Index Creation:


This method creates an index on an existing table in a single operation.


Syntax: CREATE INDEX index_name ON table_name (column1, column2, ...)


Online Index Creation:


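Db2 for Linux, UNIX and Windows allows read and write access to the table by default while an index is being created; write access is blocked only briefly at the end, when the build completes. No separate keyword is needed, because CREATE INDEX in Db2 has no ONLINE clause.


Syntax: CREATE INDEX index_name ON table_name (column1, column2, ...)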


Considerations before creating indexes:


Selectivity: Choose columns with high selectivity (many distinct values relative to the number of rows) that are frequently used in search conditions.


Query Analysis: Analyze query patterns to identify frequently executed queries that could benefit from indexes.


Performance Testing: Perform benchmark tests to assess the impact of index creation on query performance.


Disk Space: Consider the additional disk space required to store the index structure and index entries.
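The selectivity and query-analysis checks above can be approximated with plain SQL before any index is created. The sketch below reuses the hypothetical table orders_demo and column customer_id; the EXPLAIN statement assumes Db2 for Linux, UNIX and Windows with the explain tables already created.

```sql
-- A ratio near 1.0 means almost every value is distinct (highly selective);
-- a ratio near 0 means few distinct values (poorly selective).
SELECT COUNT(DISTINCT customer_id) AS distinct_values,
       COUNT(*)                    AS total_rows,
       CAST(COUNT(DISTINCT customer_id) AS DOUBLE)
           / NULLIF(COUNT(*), 0)   AS selectivity_ratio
FROM orders_demo;

-- Capture the access plan for a candidate query so it can be compared
-- before and after the index is created (requires explain tables).
EXPLAIN PLAN FOR
SELECT order_id
FROM orders_demo
WHERE customer_id = 42;
```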


Tradeoffs of index creation:


Increased Disk Space: Indexes require additional disk space, so carefully manage storage requirements.


Index Maintenance Overhead: Indexes need to be maintained and updated, which can impact insert/update/delete operations on the table.

Backup and Restore Impact: Indexes add to the volume of data that must be backed up and restored, so account for them when planning those operations.


Query Performance: While indexes improve query performance, they may introduce overhead during data modification operations due to index updates.

To mitigate the tradeoffs and ensure optimal index usage:


Regularly monitor and analyze index usage, removing or modifying indexes that are not being used effectively.


Update statistics to ensure the query optimizer has accurate information for optimal index selection.


Regularly maintain indexes, such as rebuilding or reorganizing fragmented indexes.


It's important to note that the choice of index creation method, the selection of columns, and the overall index design depend on your specific database and workload characteristics.


It's recommended to perform thorough testing and analysis, considering the specific requirements and performance goals of your DB2 environment, to make informed decisions regarding index creation and maintenance.
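The monitoring, statistics, and maintenance steps described above might look like the following on Db2 for Linux, UNIX and Windows. This is a sketch under assumptions: the schema MYSCHEMA, the table ORDERS_DEMO, and the index name are hypothetical, and the options shown are one reasonable choice rather than a recommendation for every workload.

```sql
-- 1. Check when each index on the table was last used.
--    A LASTUSED value of 0001-01-01 indicates no recorded use.
SELECT indschema, indname, lastused
FROM syscat.indexes
WHERE tabschema = 'MYSCHEMA'
  AND tabname = 'ORDERS_DEMO'
ORDER BY lastused;

-- 2. Refresh table and index statistics for the optimizer.
CALL SYSPROC.ADMIN_CMD(
  'RUNSTATS ON TABLE MYSCHEMA.ORDERS_DEMO WITH DISTRIBUTION AND DETAILED INDEXES ALL');

-- 3. Reorganize all indexes on the table while still allowing writes.
CALL SYSPROC.ADMIN_CMD(
  'REORG INDEXES ALL FOR TABLE MYSCHEMA.ORDERS_DEMO ALLOW WRITE ACCESS');

-- 4. Drop an index that the analysis shows is not being used.
DROP INDEX MYSCHEMA.IX_ORDERS_DEMO_DATE;
```

Before dropping anything, confirm the index is not backing a primary key or unique constraint and that it is not needed by infrequent jobs such as month-end reports.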



Here are some risks and considerations to keep in mind:


Impact on Database Performance:


Creating indexes involves additional disk I/O and index maintenance operations, which can temporarily impact the performance of the database.

The duration of the index creation process depends on factors such as the size of the table and the number of rows. It can range from a few seconds to several hours for larger tables.


During the index creation process, there may be increased resource utilization, including CPU, memory, and disk usage, which can affect the overall database performance and response times.


Locking and Blocking:


Depending on the DB2 isolation level and locking configuration, creating an index on a table might require locks on the table, potentially causing blocking or contention with concurrent transactions.

Locking conflicts can impact the availability of the database and might result in transaction delays or timeouts.
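One way to limit how long an index build waits on conflicting locks is to set a session-level lock timeout first. The sketch below assumes Db2 for Linux, UNIX and Windows; the 30-second value and the object names are illustrative assumptions, not tuned recommendations.

```sql
-- Wait at most 30 seconds for a lock in this session instead of
-- using the database-wide default.
SET CURRENT LOCK TIMEOUT 30;

-- Hypothetical table and index names.
CREATE INDEX ix_orders_demo_customer ON orders_demo (customer_id);

-- Return to the database default (the locktimeout configuration value).
SET CURRENT LOCK TIMEOUT NULL;
```

With a timeout in place, a statement that cannot get its locks fails with an error that can be retried later, rather than waiting indefinitely behind other transactions.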


Downtime or Maintenance Window:


In some cases, creating indexes on large tables or in heavily loaded production environments might require a maintenance window or a scheduled downtime to minimize the impact on ongoing operations.

It is recommended to perform index creation during periods of low database activity to minimize disruptions.


Backup and Recovery Considerations:


Before making any changes to the production database, it is crucial to ensure that proper backups and recovery mechanisms are in place.

In the unlikely event of issues or unexpected behavior during index creation, having a recent backup allows you to restore the database to a known good state.

To mitigate the risks associated with creating indexes on a live production database:


Thoroughly test the index creation process in a non-production environment that closely resembles the production environment to understand the potential impact and performance implications.


Analyze the workload patterns and query requirements to identify the most beneficial indexes and prioritize their creation accordingly.


Consider scheduling the index creation during off-peak hours or low activity periods to minimize disruption to the production system.


Communicate the planned index creation activity to stakeholders and users to manage expectations regarding potential temporary performance degradation.




