Deduplication in the Backup System with Information Storage in a Database



Abstract

Backup is one of the main ways to prevent the loss of data stored on digital media. It can be performed manually, by copying data to external media, or automatically on a schedule using dedicated software. There are also remote backup systems, in which data are saved over the network to a remote repository. Such systems are multi-user and process large amounts of data, so the shared storage may hold files that contain identical fragments. Deduplication eliminates this repeated data. It is a data compression method in which copies are searched for across the entire dataset rather than within a single file. Its main advantage is a significant saving of disk space. However, eliminating repeated data can significantly reduce the speed of saving and restoring information. This paper addresses the problem of implementing such a mechanism in a backup system that stores information in a relational database. It considers an example implementation of such a system that works in two modes: with and without data deduplication. The paper presents a class diagram for the client part of the application and describes the tables of the backend database and the relationships between them. The author proposes an algorithm for saving data with deduplication and reports the results of comparative tests of the speed of the saving and recovery algorithms with relational database management systems from various vendors.
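
To make the idea of dataset-wide duplicate elimination concrete, the following is a minimal sketch of block-level deduplication on top of a relational database. It is not the algorithm proposed in the paper, which the abstract does not specify. The fixed block size, the SHA-256 digest, the SQLite backend, the table names `chunks` and `file_chunks`, and the function names are all assumptions made for illustration.

```python
import hashlib
import sqlite3

CHUNK_SIZE = 4096  # assumed fixed block size; the paper's value is not stated


def open_store(path=":memory:"):
    """Create two hypothetical tables: unique blocks and per-file block order."""
    db = sqlite3.connect(path)
    db.executescript(
        """
        CREATE TABLE IF NOT EXISTS chunks (
            digest TEXT PRIMARY KEY,
            data   BLOB NOT NULL
        );
        CREATE TABLE IF NOT EXISTS file_chunks (
            file_name TEXT    NOT NULL,
            seq       INTEGER NOT NULL,
            digest    TEXT    NOT NULL REFERENCES chunks(digest),
            PRIMARY KEY (file_name, seq)
        );
        """
    )
    return db


def save_file(db, file_name, payload, dedup=True):
    """Split payload into blocks and store them, optionally deduplicated."""
    # Drop any earlier block list for this file so it can be saved again.
    db.execute("DELETE FROM file_chunks WHERE file_name = ?", (file_name,))
    for seq, start in enumerate(range(0, len(payload), CHUNK_SIZE)):
        block = payload[start:start + CHUNK_SIZE]
        key = hashlib.sha256(block).hexdigest()
        if not dedup:
            # Without deduplication every block gets its own row,
            # so the key is made unique per file and position.
            key = f"{file_name}:{seq}:{key}"
        # A block whose key is already present is not stored a second time.
        db.execute(
            "INSERT OR IGNORE INTO chunks (digest, data) VALUES (?, ?)",
            (key, block),
        )
        db.execute(
            "INSERT INTO file_chunks (file_name, seq, digest) VALUES (?, ?, ?)",
            (file_name, seq, key),
        )
    db.commit()


def restore_file(db, file_name):
    """Reassemble a file by joining its block list with the block table."""
    rows = db.execute(
        "SELECT c.data FROM file_chunks f "
        "JOIN chunks c ON c.digest = f.digest "
        "WHERE f.file_name = ? ORDER BY f.seq",
        (file_name,),
    )
    return b"".join(row[0] for row in rows)


def stored_chunk_count(db):
    """Number of block rows physically held in the store."""
    return db.execute("SELECT COUNT(*) FROM chunks").fetchone()[0]
```

In this sketch, two files that share a block produce a single row in `chunks`, which is where the disk space saving comes from. The extra digest computation and key lookup on every block illustrate the kind of overhead that can slow down saving, and restoring requires a join over the block list.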

About the authors

S. M. Taranin

Demidov Yaroslavl State University

Author for correspondence.
Email: staranin0208@yandex.ru
Russian Federation, Yaroslavl, 150003


Copyright (c) 2018 Allerton Press, Inc.