

LSSD


LSSD (Large Scale Steganalysis Database) is a database of 2 million 256x256 JPEG images, available in color or grayscale.

Webpage published in September 2020 and live since January 2021.

The following material is provided:
- a script for downloading the database: LSSD_download_script.sh.
  Alternatively, the database can be accessed directly on the dedicated website: http://lssd.lirmm.fr/
  A GitHub repository is also available: https://github.com/Yiouki/download_lssd
- a description of the LSSD folder hierarchy: DB_structure.pdf;
- the scripts for re-developing the RAW images (inspired by the ALASKA#1 script): jpeg_base_generator.zip;
- the code for the LC (Low Complexity) CNN (paper, github): low_complexity_tf.zip.

Any use of this database must be acknowledged by citing the following papers:

LSSD: "LSSD: a Controlled Large JPEG Image Database for Deep-Learning-based Steganalysis 'into the Wild'". Hugo RUIZ, Mehdi YEDROUDJ, Marc CHAUMONT, Frédéric COMBY, Gérard SUBSOL. In: Proceedings of the 25th International Conference on Pattern Recognition, ICPR'2021, Workshop on MultiMedia FORensics in the WILD, MMForWILD'2021, Virtual Conference due to Covid (formerly Milan, Italy), Lecture Notes in Computer Science, LNCS 12666, Springer, Book: Pattern Recognition and Information Forensics, Chapter No: 38, Pages 470-483, Chapter DOI: 10.1007/978-3-030-68780-9_38, January 10-15, 2021, 14 pages, https://iplab.dmi.unict.it/mmforwild/ (paper, slides, video-lirmm).

Alaska#2: "The ALASKA Steganalysis Challenge: A First Step Towards Steganalysis". Cogranne, R., Giboulot, Q., Bas, P. In: Proceedings of the ACM Workshop on Information Hiding and Multimedia Security, IH&MMSec'2019, pp. 125–137. Paris, France (Jul 2019).

StegoApp: "StegoAppDB: a Steganography Apps Forensics Image Database". Newman, J., Lin, L., Chen, W., Reinders, S., Wang, Y., Wu, M., Guan, Y. In: Proceedings of Media Watermarking, Security, and Forensics, MWSF'2019, Part of IS&T International Symposium on Electronic Imaging, EI'2019. Ingenta, Burlingame, California, USA (Jan 2019).

BOSS: "Break Our Steganographic System: The Ins and Outs of Organizing BOSS". Bas, P., Filler, T., Pevny, T. In: Proceedings of the 13th International Conference on Information Hiding, IH'2011. Lecture Notes in Computer Science, Springer, vol. 6958, pp. 59–70. Prague, Czech Republic (May 2011).

RAISE: "RAISE – A Raw Images Dataset for Digital Image Forensics". Dang-Nguyen, D.T., Pasquini, C., Conotter, V., Boato, G. In: Proceedings of ACM Multimedia Systems. Portland, Oregon (March 2015).

Dresden: "The 'Dresden Image Database' for Benchmarking Digital Image Forensics". Gloe, T., Böhme, R. In: Proceedings of the 25th Symposium On Applied Computing (ACM SAC 2010), vol. 2, pp. 1585–1591 (2010).


Fig: Diagram of the RAW image development process

Fig: Sample images, 256×256 in grayscale, obtained after the development process (developed image 3786 from the ALASKA database; developed image 6456 from the BOSS database)

Fig: Sample images, 256×256 in colour, obtained after the development process (developed image 51336 from the ALASKA database; developed image from the Wesaturate database, index ZYlVRQYDWE)

--

The LSSD database contains:

- The RAW_BASE: We gathered RAW images from several sources; each source can be downloaded separately with the script (LSSD_download_script.sh). Note that we decided to store all those RAW images on our website (http://lssd.lirmm.fr/RAW/), in order to archive the different sources and help the scientific community.
The sources are ALASKA 2, BOSS, Dresden, RAISE, Stego App, and Wesaturate (some of them are also accessible on their own websites):

Base name     Number of images   Percentage   Command to download
ALASKA 2      80 005             62.88 %      sh LSSD_download_script.sh RAW_ALASKA2
BOSS          10 000             7.66 %       sh LSSD_download_script.sh RAW_BOSS
Dresden       1 491              1.23 %       sh LSSD_download_script.sh RAW_Dresden
RAISE         8 156              6.41 %       sh LSSD_download_script.sh RAW_RAISE
Stego App     24 120             18.96 %      sh LSSD_download_script.sh RAW_StegoApp
Wesaturate    3 648              2.87 %       sh LSSD_download_script.sh RAW_Wesaturate
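
The six commands in the table can be chained in a small loop. The sketch below is only an illustration: the function name download_raw_sources and the LSSD_RUN switch are hypothetical helpers, not part of LSSD_download_script.sh. By default it only prints the commands, so nothing is downloaded until LSSD_RUN=1 is set.

```shell
#!/bin/sh
# Source keys exactly as listed in the table above.
RAW_SOURCES="RAW_ALASKA2 RAW_BOSS RAW_Dresden RAW_RAISE RAW_StegoApp RAW_Wesaturate"

# Hypothetical helper: dry run by default, real download when LSSD_RUN=1.
download_raw_sources() {
    for src in $RAW_SOURCES; do
        if [ "${LSSD_RUN:-0}" = "1" ]; then
            # Assumes LSSD_download_script.sh is in the current directory.
            sh LSSD_download_script.sh "$src" || return 1
        else
            echo "sh LSSD_download_script.sh $src"
        fi
    done
}

download_raw_sources
```

Since the RAW files are large, printing the commands first is a cheap way to check the list before starting the transfers.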

  
- LSSD_JPEG_256x256_2M: The LSSD database, obtained by developing the RAW_BASE (see the paper for explanations); you can select either the color or the grey-level version. It contains 2 million images, available either in JPEG or in MAT format (MAT = decompressed, stored as double-precision values).
Command to download JPEG: sh LSSD_download_script.sh -b LSSD_2M -t JPEG -c Color (or Gray) -n Cover.
Command to download MAT: sh LSSD_download_script.sh -b LSSD_2M -t MAT -c Color (or Gray) -n Cover.

- LSSD_JPEG_256x256_1M: Images from LSSD_JPEG_256x256_1M are all included in LSSD_JPEG_256x256_2M. Command to download JPEG or MAT: sh LSSD_download_script.sh -b LSSD_1M -t JPEG (or MAT) -c Color (or Gray) -n Cover.

- LSSD_JPEG_256x256_500K: Images from LSSD_JPEG_256x256_500K are all included in LSSD_JPEG_256x256_1M. Command to download JPEG or MAT: sh LSSD_download_script.sh -b LSSD_500k -t JPEG (or MAT) -c Color (or Gray) -n Cover.

- LSSD_JPEG_256x256_100K: Images from LSSD_JPEG_256x256_100K are all included in LSSD_JPEG_256x256_500K.
Command to download JPEG or MAT: sh LSSD_download_script.sh -b LSSD_100k -t JPEG (or MAT) -c Color (or Gray) -n Cover.

- LSSD_JPEG_256x256_50K: Images from LSSD_JPEG_256x256_50K are all included in LSSD_JPEG_256x256_100K.
Command to download JPEG or MAT: sh LSSD_download_script.sh -b LSSD_50k -t JPEG (or MAT) -c Color (or Gray) -n Cover.

- LSSD_JPEG_256x256_10K: Images from LSSD_JPEG_256x256_10K are all included in LSSD_JPEG_256x256_50K. Command to download JPEG or MAT: sh LSSD_download_script.sh -b LSSD_10k -t JPEG (or MAT) -c Color (or Gray) -n Cover.

- LSSD_JPEG_256x256_TEST: This database is disjoint from the LSSD: the RAW images used to generate this test database are different from those used to generate LSSD_JPEG_256x256_2M.
Command to download JPEG or MAT: sh LSSD_download_script.sh -b TST_100k -t JPEG (or MAT) -c Color (or Gray) -n Cover.
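
All the cover commands above follow the same pattern: -b selects the subset, -t the format, -c the color mode, and -n Cover the image type. As a minimal sketch, the hypothetical helper below (lssd_cover_command is not part of the provided scripts) checks the three choices against the values listed on this page and prints the matching command without running it.

```shell
#!/bin/sh
# Hypothetical helper: print (not run) the download command for one cover subset.
# Usage: lssd_cover_command BASE FORMAT COLOR
lssd_cover_command() {
    base=$1
    fmt=$2
    color=$3
    # Accept only the -b values that appear on this page.
    case "$base" in
        LSSD_10k|LSSD_50k|LSSD_100k|LSSD_500k|LSSD_1M|LSSD_2M|TST_100k) ;;
        *) echo "unknown base: $base" >&2; return 1 ;;
    esac
    case "$fmt" in
        JPEG|MAT) ;;
        *) echo "unknown format: $fmt" >&2; return 1 ;;
    esac
    case "$color" in
        Color|Gray) ;;
        *) echo "unknown color mode: $color" >&2; return 1 ;;
    esac
    echo "sh LSSD_download_script.sh -b $base -t $fmt -c $color -n Cover"
}

# Smallest training subset in grayscale JPEG, plus the disjoint test set.
lssd_cover_command LSSD_10k JPEG Gray
lssd_cover_command TST_100k JPEG Gray
```

Because each subset is included in the next larger one, downloading LSSD_2M makes the smaller subsets redundant; the test set TST_100k always has to be fetched separately.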

--

For quick use, we also provide below the stego databases, obtained by embedding with J-UNIWARD at 0.2 bpnzac (bits per non-zero AC coefficient):

- LSSD_JPEG_256x256_2M_STEGO.
Command to download JPEG or MAT (only in grey-level): sh LSSD_download_script.sh -b LSSD_2M -t JPEG (or MAT) -c Gray -n Stego_P02.

- LSSD_JPEG_256x256_1M_STEGO.
Command to download JPEG or MAT (only in grey-level): sh LSSD_download_script.sh -b LSSD_1M -t JPEG (or MAT) -c Gray -n Stego_P02.

- LSSD_JPEG_256x256_500K_STEGO.
Command to download JPEG or MAT (only in grey-level): sh LSSD_download_script.sh -b LSSD_500k -t JPEG (or MAT) -c Gray -n Stego_P02.

- LSSD_JPEG_256x256_100K_STEGO. 
Command to download JPEG or MAT (only in grey-level): sh LSSD_download_script.sh -b LSSD_100k -t JPEG (or MAT) -c Gray -n Stego_P02.

- LSSD_JPEG_256x256_50K_STEGO. 
Command to download JPEG or MAT (only in grey-level): sh LSSD_download_script.sh -b LSSD_50k -t JPEG (or MAT) -c Gray -n Stego_P02.

- LSSD_JPEG_256x256_10K_STEGO: Images from LSSD_JPEG_256x256_10K are all included in LSSD_JPEG_256x256_50K.
Command to download JPEG or MAT (only in grey-level): sh LSSD_download_script.sh -b LSSD_10k -t JPEG (or MAT) -c Gray -n Stego_P02.

--

- LSSD_JPEG_256x256_TEST_STEGO:
Command to download JPEG or MAT (only in grey-level): sh LSSD_download_script.sh -b TST_100k -t JPEG (or MAT) -c Gray -n Stego_P02.
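
Training a steganalyzer needs the cover images and the matching stego images of the same subset. The sketch below prints both commands for a grayscale subset; lssd_pair_commands is a hypothetical helper, and it assumes the stego sets use the same -b values as the cover sets, with only the -n argument changing.

```shell
#!/bin/sh
# Hypothetical helper: print the cover and stego download commands
# for one grayscale subset. The format defaults to JPEG.
# Usage: lssd_pair_commands BASE [FORMAT]
lssd_pair_commands() {
    base=$1
    fmt=${2:-JPEG}
    # Stego images are only provided in grey-level, so -c is fixed to Gray.
    for name in Cover Stego_P02; do
        echo "sh LSSD_download_script.sh -b $base -t $fmt -c Gray -n $name"
    done
}

# A small training subset and the disjoint test set.
lssd_pair_commands LSSD_10k
lssd_pair_commands TST_100k
```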

--

A few words on the project:

The objective of our project is to study the behavior of steganalysis with huge learning bases and in controlled conditions.

The original images are in RAW format, that is, the information captured by the camera's physical sensor when the picture is taken. RAW files are large and rarely found on the Internet. We give you access to all the RAW images, sorted according to their origin.

The database is developed with usual parameters. It comes in several sizes (10k, 50k, 100k...) and in two formats (JPEG or MAT). It is also possible to choose between color and grayscale images.

The given scripts are the ones we used throughout our project. The first one is especially for you: it allows you to download any part of the LSSD database. The second one develops JPEG images from RAW images. The last one trains the Low Complexity neural network on the selected database.

Related work:

"Analysis of the Scalability of a Deep-Learning Network for Steganography 'Into the Wild'". Hugo RUIZ, Marc CHAUMONT, Mehdi YEDROUDJ, Ahmed OULAD AMARA, Frédéric COMBY, Gérard SUBSOL. In: Proceedings of the 25th International Conference on Pattern Recognition, ICPR'2021, Workshop on MultiMedia FORensics in the WILD, MMForWILD'2021, Virtual Conference due to Covid (formerly Milan, Italy), Lecture Notes in Computer Science, LNCS 12666, Springer, Book: Pattern Recognition and Information Forensics, Chapter No: 36, Pages 439-452, Chapter DOI: 10.1007/978-3-030-68780-9_36, January 10-15, 2021, 14 pages, https://iplab.dmi.unict.it/mmforwild/ (paper, slides, video-lirmm, video-screencast-o-matic, video-youtube).
