
Chapter 4 Synthetic Image Authentication

4.1 Introduction

Besides images taken from the natural world, many synthetic images are widely used in various applications, such as digital maps, document images, engineering drawings, computer-generated graphics, scanned documents, and handwritten signatures. For example, digital maps are now widely used in Geographic Information Systems (GIS) on the Internet and on handheld devices.

In addition, all kinds of important documents, such as legal documents, financial instruments, certificates and insurance information, have been digitized and stored.

Because synthetic images are so widely used, their authentication against tampering and forgery is becoming a great concern.

To distinguish them from graphics, the term synthetic image in this thesis refers to all simple images that are represented by a small number of color or gray values. The extreme case is binary images, which contain only two colors: black and white. Compared with continuous-tone natural images, synthetic images have far fewer colors and no complex texture variation. Unlike natural images, in which pixel values vary over a wide range, the pixels in synthetic images take on only a limited number of values.

Moreover, in synthetic images, the color and brightness usually change abruptly from one value to another without any transition, which results in sharp edges. In addition, there are usually large homogeneous regions in synthetic images. In these regions, there is only one uniform gray level or color. Hence, arbitrarily changing pixels in a synthetic image will cause very visible artifacts. Table 4-1 lists the major differences between natural images and synthetic images.

Table 4-1 Comparison of natural and synthetic images

| Characteristics | Natural Images | Synthetic Images |
|---|---|---|
| Color | Continuous-tone, plenty of colors (true-color) | A few colors, limited number of pixel values |
| Texture | Complex texture variation | Simple texture, plenty of flat regions |
| Edge | Mild edges with gradual brightness and color transition | Sharp edges with abrupt brightness and color change |
| Storage format | True-color formats, e.g. JPEG, TIFF, BMP | Indexed color or binary formats, e.g. GIF, PNG, TIFF |
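The color and texture rows of Table 4-1 can be made concrete with two simple measurements. The following is only a minimal sketch and not part of the proposed scheme: the function names, the 4-neighbour definition of a "flat" pixel, and the two toy images are assumptions chosen for illustration.

```python
def distinct_values(image):
    """Return the set of distinct pixel values in a 2-D list of pixels."""
    return {p for row in image for p in row}


def flat_fraction(image):
    """Fraction of interior pixels whose four neighbours share their value.

    This 4-neighbour criterion is an illustrative assumption, not a
    definition taken from the thesis.
    """
    h, w = len(image), len(image[0])
    flat = total = 0
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            total += 1
            c = image[y][x]
            if (c == image[y - 1][x] == image[y + 1][x]
                    == image[y][x - 1] == image[y][x + 1]):
                flat += 1
    return flat / total if total else 0.0


# Toy synthetic image: white background with one black vertical line.
synthetic = [[0 if x == 1 else 255 for x in range(8)] for y in range(8)]

# Toy stand-in for a continuous-tone image: a smooth diagonal gradient.
gradient = [[7 * x + 13 * y for x in range(8)] for y in range(8)]

for name, img in (("synthetic", synthetic), ("gradient", gradient)):
    print(name, len(distinct_values(img)), round(flat_fraction(img), 2))
```

On these toy inputs the synthetic image has two distinct values and mostly flat interior pixels, whereas the gradient has a different value at every pixel and no flat pixels at all.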

Figure 4-1 and Figure 4-2 give two examples of synthetic images. Figure 4-1 shows a text image, which is a typical kind of synthetic image. The sample image is a binary bitmap that contains only black and white and has large blank margins. When stored in an indexed color format, its color palette contains only two entries. In addition, because the pixels take on only two possible values, a binary image is also commonly stored as a bitmap with a color depth of 1 bit per pixel.

Figure 4-2 presents a digital map as an example of a color synthetic image. The sample map does not look like a very simple image, as it contains many lines, curves and symbols in different colors. In this map, however, there are actually only seven distinct colors, as shown in the color palette on the right. Furthermore, many regions in the map are filled with uniform colors and contain no texture at all. One magnified smooth part is shown at the bottom of Figure 4-2, in which most of the area is filled with pure white. Note that synthetic images, such as digital maps, can also be stored in vector image formats. In this chapter, we only consider synthetic images stored as bitmap images.
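To illustrate how a two-entry palette and a one-bit color depth fit together, the sketch below packs a small binary image into bytes, eight pixels per byte. The row padding and the most-significant-bit-first order are assumptions made for this example; real file formats each define their own layout.

```python
def pack_binary_rows(image):
    """Pack a 2-D list of 0/1 palette indices into bytes.

    Eight pixels go into one byte, most significant bit first, and each
    row is padded with zero bits to a byte boundary (an assumed layout).
    """
    packed = bytearray()
    for row in image:
        for start in range(0, len(row), 8):
            chunk = row[start:start + 8]
            byte = 0
            for bit in chunk:
                byte = (byte << 1) | (bit & 1)
            byte <<= 8 - len(chunk)  # pad a short final chunk
            packed.append(byte)
    return bytes(packed)


# Two-entry palette: index 0 is black, index 1 is white.
palette = [(0, 0, 0), (255, 255, 255)]

# A 10 x 3 binary image: white margin with two short black strokes.
image = [
    [1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
    [1, 0, 0, 1, 1, 1, 0, 0, 1, 1],
    [1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
]

packed = pack_binary_rows(image)
print(len(packed), "bytes packed versus", 10 * 3, "bytes at 8 bits per pixel")
```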

Because of these unique properties, invisibly embedding data in a synthetic image is a more challenging task than in a natural image. On the other hand, unfortunately, due to the simplicity of the content, it is much easier for an adversary to manipulate a synthetic image than a continuous-tone natural image. An adversary does not even need powerful image-editing software like Adobe Photoshop, because a simple image modification tool is enough to make a perfect forgery without leaving any noticeable traces on the original synthetic image.

Figure 4-1 Synthetic image example A: binary text image (with a close view of the binary text image)

Although a variety of watermarking schemes for image authentication have been proposed, most of them were developed for color and grayscale natural images and cannot be applied to synthetic images directly. In those schemes, the watermark information is commonly embedded by changing the least significant bits (LSB) of the pixel values [W98][FGB00][F02] or by slightly modifying the transform coefficients [WL98][F99][WKBC02]. Since synthetic images contain plenty of sharp edges and smooth areas, such modifications will either introduce visible artifacts or, because of the weak embedding strength, significantly decrease the reliability of the embedded watermark.

Another problem caused by common watermarking schemes for natural images is that new colors will be introduced into the cover image. As synthetic images contain only a limited number of colors, they are usually stored in indexed color formats (e.g. GIF, Graphics Interchange Format) instead of true-color formats (e.g. JPEG). The indexed color formats use a color palette to list the colors that are used. The number of colors that an indexed color format can store is usually limited and depends on the size of the color palette. For example, the GIF format uses a color palette that can contain at most 256 distinct colors. The classical watermarking approaches for natural images embed the watermark by slightly changing the pixel values, which will inevitably introduce new pixel values, i.e. new colors. It becomes even worse for the approaches that embed data in the transform domain by modifying the frequency coefficients, e.g. in the DCT or DWT domain, because in this case it is very hard to predict and control the number of newly introduced colors. Because the introduction of new colors changes the entries of the original image palette, from the compatibility point of view it is very undesirable in most applications to introduce additional pixel values into synthetic images. Moreover, when so many additional colors are introduced that the total number of colors exceeds the palette's capacity, it becomes impossible to store the watermarked image in the original format.

Figure 4-2 Synthetic image example B: color digital map (panels: color digital map, color palette with 7 colors, smooth area example)
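The palette problem described above can be reproduced with a few lines of code. The sketch below applies a plain LSB replacement to a tiny three-level gray image and lists the pixel values that did not exist before embedding. It is a simplified stand-in for the cited LSB schemes, not a reimplementation of any of them; the function names and the sample data are assumptions.

```python
def embed_lsb(pixels, bits):
    """Replace the least significant bit of each pixel with a watermark bit."""
    return [(p & ~1) | b for p, b in zip(pixels, bits)]


def new_values(original, watermarked):
    """Pixel values present after embedding that the original did not use."""
    return set(watermarked) - set(original)


# A toy synthetic image (flattened) that uses only three gray levels.
cover = [0, 0, 128, 128, 255, 255, 0, 255]
bits = [1, 0, 1, 0, 1, 0, 1, 0]

marked = embed_lsb(cover, bits)
introduced = new_values(cover, marked)

print("levels before:", len(set(cover)))
print("levels after: ", len(set(marked)))
print("new levels:   ", sorted(introduced))
```

Every new level would require an additional palette entry, which is exactly what the format compatibility requirement in the list below is meant to rule out.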

Therefore, for synthetic image authentication, specific watermarking schemes must be designed to address the above-mentioned problems. Generally speaking, a watermarking scheme for synthetic image authentication should satisfy the following requirements:

1. Watermark transparency: no noticeable artifacts should be introduced, i.e. high image quality must be achieved after watermark embedding.

2. Format compatibility: no additional color should be introduced in the watermarked image in order to keep the color palette intact.

3. Tamper localization capability: the authenticator should be able to localize the tampered region with a resolution as high as possible.

4. Recovery capability: it is a very desirable feature that the authenticator is able to recover the original content in tampered regions.

5. Blind detection: the embedded watermark can be extracted without referring to the original image.

In this chapter, we propose a novel watermarking scheme to authenticate the content

[...] are specified compared to the other existing schemes so that the quality of the watermarked image is improved. Moreover, in the embedding process, no additional pixel value will be introduced. A random permutation process is applied to the whole image before embedding the watermark bits. The watermark information is embedded in the permuted image domain, and every embedded watermark bit is used to monitor a group of pixels, so that all pixels of the image, instead of blocks, are covered by far fewer watermark bits. By combining random permutation and statistical tamper detection, the proposed scheme achieves pixel-wise tamper localization capability. We present a new embedding strategy that enables the recovery capability of the authentication system. Hence, in the authentication process, the proposed scheme can not only localize the tampered area but also recover the removed content and identify the forged parts. Experimental results demonstrate the capability of the proposed scheme to localize and recover tampered areas in watermarked images. The proposed scheme can be applied to various kinds of synthetic images, including binary images and images with few colors.
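The idea of letting one watermark bit monitor a group of pixels in a permuted domain can be sketched as follows. This is only a rough illustration under strong assumptions and is not the scheme of Section 4.3: the group parity used as the monitoring bit, the integer key, the fixed group size, and the externally stored reference bits are all hypothetical choices, and the actual embedding and the statistical tamper detection are not reproduced here.

```python
import random


def keyed_permutation(n, key):
    """Pseudo-random permutation of range(n), reproducible from a secret key."""
    rng = random.Random(key)
    order = list(range(n))
    rng.shuffle(order)
    return order


def group_parities(pixels, order, group_size):
    """One monitoring bit per group of permuted pixels (parity is assumed)."""
    parities = []
    for start in range(0, len(order), group_size):
        group = order[start:start + group_size]
        parities.append(sum(pixels[i] for i in group) & 1)
    return parities


def suspicious_groups(pixels, order, group_size, reference):
    """Indices of groups whose current bit differs from the reference bit."""
    current = group_parities(pixels, order, group_size)
    return [g for g, (a, b) in enumerate(zip(current, reference)) if a != b]


# A flattened 4 x 4 binary image; key and group size are hypothetical.
pixels = [1, 1, 0, 1, 1, 0, 0, 1, 1, 1, 1, 1, 0, 1, 1, 0]
order = keyed_permutation(len(pixels), key=2024)
reference = group_parities(pixels, order, group_size=4)

# Simulate tampering by flipping a single pixel.
tampered = list(pixels)
tampered[5] ^= 1
flagged = suspicious_groups(tampered, order, 4, reference)
print("flagged groups:", flagged)
```

Because of the permutation, the pixels of one group are scattered over the whole image, so a single flagged group does not by itself identify the modified pixel; the sketch only shows how a small number of bits can cover every pixel.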

The organization of this chapter is as follows. First, in Section 4.2, we review previous work related to authentication and data hiding for synthetic images and address the unsolved problems and challenges of synthetic image authentication. Then, in Section 4.3, we introduce the proposed watermarking scheme, including the watermark embedding and retrieval processes. The authentication process is presented in Section 4.4. Afterwards, we analyze the performance and security of the proposed scheme in Section 4.5. Experimental results are given in Section 4.6. In Section 4.7, we discuss possible extensions of the proposed embedding strategy. Finally, we conclude the chapter in Section 4.8.

From the document: Digital Watermarking for Image Content Authentication (pages 91-96)