Building a Scalable Search Architecture
Avital Trifsik
Scaling architecture patterns
Primary/replica architecture
A primary/replica architecture is used to support a large number of reads. It replicates the data on the main (primary) server to several replicas, which lets you process several times more requests than with a single copy of the data on one server. For example, with three copies of the data on three servers, you can serve roughly three times as many read requests as with one copy on one server.
To support more write operations, you need to split the data into several smaller parts (shards) and add more CPU to index them. Scaling both reads and writes therefore means adding shards and keeping multiple copies of each shard on multiple machines.
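As a rough illustration of how this fits together, here is a minimal sketch of routing writes to shards by hashing the document id and spreading reads across each shard's replicas. The `ReplicaSet` class, the hash-based routing, and the node names are illustrative assumptions, not the API of any particular search engine.

```python
import hashlib
import itertools


class ReplicaSet:
    """One shard: a single primary that accepts writes plus read-only replicas."""

    def __init__(self, primary, replicas):
        self.primary = primary
        self.replicas = replicas
        # Round-robin over replicas to spread read traffic.
        self._read_cycle = itertools.cycle(replicas)

    def node_for_write(self):
        return self.primary

    def node_for_read(self):
        return next(self._read_cycle)


def shard_for(doc_id: str, num_shards: int) -> int:
    """Hash-based routing: the same document always lands on the same shard."""
    digest = hashlib.md5(doc_id.encode()).hexdigest()
    return int(digest, 16) % num_shards


# Example: 3 shards, each with 1 primary and 2 replicas (names are made up).
shards = [
    ReplicaSet(primary=f"shard{i}-primary",
               replicas=[f"shard{i}-replica{j}" for j in range(2)])
    for i in range(3)
]

doc_id = "https://example.com/page-42"
target = shards[shard_for(doc_id, len(shards))]
print("write goes to:", target.node_for_write())
print("read goes to:", target.node_for_read())
```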
To implement a primary/replica architecture that replicates the primary server's data across multiple replicas, each shard has exactly one copy, the primary, that accepts writes. The other replicas treat the primary as the source of truth. Synchronization is usually done through a log file stored on the primary shard, which records every write the primary receives in sequential order. Each replica reads writes from this log and applies them locally.
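The sketch below shows the log-based synchronization idea for a single shard: the primary appends every write to an ordered log, and a replica applies the log entries it has not seen yet. The `Primary`/`Replica` classes and the pull-style `sync()` are simplified assumptions for illustration only.

```python
class Primary:
    """Accepts writes and appends them to a sequential log (the source of truth)."""

    def __init__(self):
        self.log = []          # ordered list of write operations
        self.data = {}

    def write(self, key, value):
        op = {"seq": len(self.log), "key": key, "value": value}
        self.log.append(op)
        self.data[key] = value

    def log_since(self, seq):
        """Return every write a replica has not applied yet."""
        return self.log[seq:]


class Replica:
    """Reads the primary's log and applies the writes locally, in order."""

    def __init__(self, primary):
        self.primary = primary
        self.applied_seq = 0
        self.data = {}

    def sync(self):
        for op in self.primary.log_since(self.applied_seq):
            self.data[op["key"]] = op["value"]
            self.applied_seq = op["seq"] + 1


primary = Primary()
primary.write("doc:1", {"title": "hello"})
primary.write("doc:2", {"title": "world"})

replica = Replica(primary)
replica.sync()
print(replica.data)   # both writes are now visible on the replica
```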
The main disadvantage of this approach is that indexing and search are co-located on the same machines. Because we need multiple copies of the data, the CPU and memory spent on indexing are duplicated as well, which greatly increases costs. If you need to scale indexing and search at the same time, the replication factor grows and applies to more data, requiring a huge amount of additional resources.
Increased CPU and memory usage for indexing can also hurt the end-user experience when the same machine handles both indexing and search queries. If search traffic from end users spikes, the resources consumed by indexing may limit the system's ability to absorb the surge.
In addition, this approach limits auto-scaling: adding a new replica requires pulling data from existing machines, which often takes several hours and puts extra load on those machines. As a result, you have to over-provision the architecture well in advance of any significant increase in data or query volume.
Replication of a binary data structure
Another way to create a scalable search architecture is to replicate a binary data structure committed to disk after an indexing task has been completed.
This approach avoids duplicating the CPU and memory used for indexing. However, because every indexing task rewrites the data structures in full, the binary files can be large, and copying them to replicas introduces delay.
Most search architectures, however, have to handle a large volume of indexing and search operations and make new writes searchable in under a minute. In most cases, this approach is therefore not used.
In addition, search engines rely on generational data structures: instead of a single binary file, there is a set of files. New indexing operations are written to small data structures on disk in generation zero; once these files reach a certain size, they are merged into generation 1, which removes duplicates and keeps searches efficient. The disadvantage of this approach is that such merges modify the files on disk, and each replica then needs to fetch a new version containing all of the shard's data.
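To make the generational idea concrete, here is a toy sketch: each indexing batch becomes a small generation-0 segment, and once a generation holds enough segments they are merged (deduplicating by document id) into the next generation. The `SegmentedIndex` class, the in-memory segments, and the merge threshold are all made-up simplifications of what real engines do on disk.

```python
MERGE_THRESHOLD = 4   # arbitrary: merge once a generation holds this many segments


class SegmentedIndex:
    """Toy generational index: generations[0] holds the newest, smallest segments."""

    def __init__(self):
        self.generations = [[]]   # list of generations, each a list of segments (dicts)

    def index(self, docs):
        # Every indexing batch becomes a new generation-0 segment.
        self.generations[0].append(dict(docs))
        self._maybe_merge(0)

    def _maybe_merge(self, gen):
        if len(self.generations[gen]) < MERGE_THRESHOLD:
            return
        # Merge all segments of this generation into one (newer values win)
        # and push the result into the next generation.
        merged = {}
        for segment in self.generations[gen]:
            merged.update(segment)      # deduplicates by document id
        self.generations[gen] = []
        if len(self.generations) == gen + 1:
            self.generations.append([])
        self.generations[gen + 1].append(merged)
        self._maybe_merge(gen + 1)

    def get(self, doc_id):
        # Check the newest generations and segments first so the latest version wins.
        for generation in self.generations:
            for segment in reversed(generation):
                if doc_id in segment:
                    return segment[doc_id]
        return None


idx = SegmentedIndex()
for i in range(10):
    idx.index({f"doc:{i}": f"content {i}"})
print([len(g) for g in idx.generations])   # segments remaining per generation
print(idx.get("doc:3"))
```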
The process of merging data and transferring it for replication is influenced by two factors (a rough estimate follows the list):
- The shard size, which determines the maximum amount of data to transfer once all generations have been merged.
- The number of generations, which directly affects how often all generations are merged together, i.e. how many times the maximum data size has to be transferred.
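A back-of-the-envelope sketch of how these two factors drive replication traffic is shown below. The function name and all the numbers are made up purely for illustration.

```python
def replication_traffic_gb(shard_size_gb, full_merges_per_day, replicas):
    """Rough upper bound: every full merge forces each replica to re-fetch the whole shard."""
    return shard_size_gb * full_merges_per_day * replicas


# Illustrative numbers only: a 50 GB shard fully merged twice a day and
# replicated to 3 replicas moves about 300 GB per day just for replication.
print(replication_traffic_gb(shard_size_gb=50, full_merges_per_day=2, replicas=3))
```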
Main services of a search architecture
To create a search architecture, you need three services: crawlers, web page processors, and indexing.
Crawlers are bots that visit a web page, collect all the links on it, and follow them. This is how search engines constantly discover new content.
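Here is a minimal breadth-first crawler sketch. It assumes the third-party `requests` and `beautifulsoup4` packages are installed, and it deliberately ignores robots.txt, politeness delays, and URL filtering, which any real crawler needs.

```python
from collections import deque
from urllib.parse import urljoin

import requests                      # third-party: pip install requests
from bs4 import BeautifulSoup        # third-party: pip install beautifulsoup4


def crawl(seed_url, max_pages=20):
    """Breadth-first crawl: fetch a page, collect its links, and follow them."""
    visited, queue = set(), deque([seed_url])
    while queue and len(visited) < max_pages:
        url = queue.popleft()
        if url in visited:
            continue
        try:
            response = requests.get(url, timeout=5)
        except requests.RequestException:
            continue
        visited.add(url)
        soup = BeautifulSoup(response.text, "html.parser")
        for anchor in soup.find_all("a", href=True):
            queue.append(urljoin(url, anchor["href"]))
    return visited


# print(crawl("https://example.com"))
```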
Web page processors read the page content and metadata. The content is then broken down into simpler forms that can be grouped by different criteria, such as topics or keywords. The metadata contains useful information such as keywords, descriptions, and more.
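A simplified page-processor sketch is shown below: it pulls out the title, meta description and keywords, and reduces the visible text to a bag of lowercase terms. The tokenization is deliberately naive, and `beautifulsoup4` is again an assumed dependency.

```python
import re

from bs4 import BeautifulSoup        # third-party: pip install beautifulsoup4


def process_page(html):
    """Extract metadata and break the page body into simple terms for grouping."""
    soup = BeautifulSoup(html, "html.parser")

    metadata = {
        "title": soup.title.string if soup.title else "",
        "description": "",
        "keywords": [],
    }
    for meta in soup.find_all("meta"):
        name = (meta.get("name") or "").lower()
        if name == "description":
            metadata["description"] = meta.get("content", "")
        elif name == "keywords":
            metadata["keywords"] = [k.strip() for k in meta.get("content", "").split(",")]

    # Naive tokenization of the visible text into lowercase terms.
    text = soup.get_text(separator=" ")
    terms = re.findall(r"[a-z0-9]+", text.lower())
    return metadata, terms


html = """<html><head><title>Search 101</title>
<meta name="keywords" content="search, indexing"></head>
<body>Search engines index the web.</body></html>"""
print(process_page(html))
```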
Indexing organizes the information that has been found so that it can be retrieved quickly and easily. Keywords and page rank can be used for this purpose, although more efficient indexing requires some research and development.
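A minimal inverted-index sketch, mapping each term to the documents that contain it, is shown below. Real engines layer ranking signals such as term frequency and page rank on top of this structure; the class and the AND-only search here are illustrative assumptions.

```python
from collections import defaultdict


class InvertedIndex:
    """Maps each term to the set of document ids that contain it."""

    def __init__(self):
        self.postings = defaultdict(set)

    def add(self, doc_id, terms):
        for term in terms:
            self.postings[term].add(doc_id)

    def search(self, query_terms):
        # Return documents containing every query term (simple AND semantics).
        sets = [self.postings.get(t, set()) for t in query_terms]
        return set.intersection(*sets) if sets else set()


index = InvertedIndex()
index.add("page-1", ["search", "engines", "index", "the", "web"])
index.add("page-2", ["scalable", "search", "architecture"])
print(index.search(["search", "web"]))   # {'page-1'}
print(index.search(["search"]))          # {'page-1', 'page-2'}
```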