CMPT 820   Multimedia Systems
Spring 2019

Reading List

You will be choosing papers from this list for your contributions to our seminar presentations.
(You may also introduce different papers to discuss, with approval.)


    A. Introduction to Digital Image and Video Compression

  1. G.K. Wallace, "The JPEG still picture compression standard", Communications of the ACM, 34(4):30-44, 1991. ([pdf])
  2. J.F. Blinn, "What's the deal with the DCT?", IEEE Computer Graphics and Applications, 13(4):78-83, 1993. ([pdf])
  3. R. Schafer and T. Sikora, "Digital video coding standards and their role in video communications", Proceedings of the IEEE, 83(6):907-924, 1995. ([pdf])

    B. MPEG-4/H.264, H.265 and MPEG-7

  5. T. Sikora, "The MPEG-4 video standard verification model", IEEE Trans. on Circuits and Systems for Video Technology, 7(1):19-31, 1997. ([pdf])
  6. T. Wiegand, G.J. Sullivan, G. Bjontegaard, and A. Luthra, "Overview of the H.264/AVC video coding standard", IEEE Trans. on Circuits and Systems for Video Technology, 13(7):560-576, 2003. ([pdf])
  7. G.J. Sullivan, et al., "Overview of the High Efficiency Video Coding (HEVC) Standard", IEEE Trans. on Circuits and Systems for Video Technology, 22(12):1649-1668, 2012. ([pdf])
  8. T. Sikora, "The MPEG-7 visual standard for content description-an overview", IEEE Trans. on Circuits and Systems for Video Technology, 11(6):696-702, 2001. ([pdf])

    C. Wavelets and JPEG-2000

  10. S. Mallat, "A theory for multiresolution signal decomposition: the wavelet representation", IEEE Trans. on Pattern Analysis and Machine Intelligence, 11(7):674-693, 1989. ([pdf])
  11. M. Antonini, et al., "Image coding using wavelet transform", IEEE Trans. on Image Processing, 1(2):205-221, 1992. ([pdf])
  12. J.M. Shapiro, "Embedded image coding using zerotrees of wavelet coefficients", IEEE Trans. on Signal Processing, 41(12):3445-3462, 1993. ([pdf])
  13. C. Christopoulos, A. Skodras, and T. Ebrahimi, "The JPEG2000 still image coding system: An overview", IEEE Trans. on Consumer Electronics, 46(4):1103-1127, 2000. ([pdf])

    D. Image and Video Quality Assessment

  14. Z. Wang and A.C. Bovik, "Mean squared error: Love it or leave it?", IEEE Signal Processing Magazine, 26(1):98-117, 2009. ([pdf])
  15. S. Chikkerur, et al., "Objective video quality assessment methods: A classification, review, and performance comparison", IEEE Trans. on Broadcasting, 57(2):165-182, 2011. ([pdf])
  16. S.J. Daly, R.T. Held and D.M. Hoffman, "Perceptual issues in stereoscopic signal processing", IEEE Trans. on Broadcasting, 57(2):347-361, 2011. ([pdf])
  17. S. Bosse, et al., "Deep neural networks for no-reference and full-reference image quality assessment", IEEE Trans. on Image Processing, 27(1):206-219, 2018. ([pdf])

    E. Content Based Image and Video Retrieval

  19. A.W.M. Smeulders, et al., "Content-based image retrieval at the end of the early years", IEEE Trans. on Pattern Analysis and Machine Intelligence, 22(12):1349-1380, 2000. ([pdf])
  20. P. Papadakis, et al., "Panorama: A 3D shape descriptor based on panoramic views for unsupervised 3D object retrieval", Int. Journal of Computer Vision, 89(2-3):177-192, 2010. ([pdf])
  21. M. Tzelepi and A. Tefas, "Deep convolutional learning for content based image retrieval", Neurocomputing, Vol. 275, pp. 2467-2478, 2018. ([pdf])
  22. Z. Li, et al., "Large scale retrieval for medical image analytics: A comprehensive review", Medical Image Analysis, Vol. 43, pp. 66-84, 2018. ([pdf])

    F. Visual Content Analysis

  24. J. Shotton, et al., "Efficient human pose estimation from single depth images", IEEE Trans. on Pattern Analysis and Machine Intelligence, 35(12):2821-2840, 2013 ([pdf]). Short and original CVPR2011 paper ([pdf]). ([Supplementary Materials]) ([ppt])
  25. R. Girshick, et al., "Rich feature hierarchies for accurate object detection and semantic segmentation", Proc. IEEE Conf. on Computer Vision and Pattern Recognition (CVPR2014), 2014. ([pdf])
  26. K. He, et al., "Mask R-CNN", Proc. Int. Conf. on Computer Vision (ICCV 2017), Marr Prize paper, 2017. ([pdf])
  27. G. Huang, et al., "Densely connected convolutional networks", Proc. IEEE Conf. on Computer Vision and Pattern Recognition (CVPR2017), Best Paper, 2017. ([pdf])
  28. H. Joo, et al., "Total capture: a 3D deformation model for tracking faces, hands, and bodies", Proc. IEEE Conf. on Computer Vision and Pattern Recognition (CVPR2018), Best Student Paper, 2018. ([pdf])
  29. J.R. Smith, et al., "Harnessing A.I. for augmenting creativity: Application to movie trailer creation", In Proc. ACM Multimedia 2017, Best Brave New Idea Paper, 2017. ([pdf])
  30. Y. Huang, et al., "Video Super-Resolution via Bidirectional Recurrent Convolutional Networks", IEEE Trans. on Pattern Analysis and Machine Intelligence, 40(4):1015-1028, 2018. ([pdf])
  31. L. Zhang, H. Dong, and A. El Saddik, "From 3D sensing to printing: A survey", ACM Trans. on Multimedia Computing, Communications, and Applications, 12(2):Article 27, pp. 1-22, 2015. ([pdf])
  32. B. Liu, et al., "Beyond narrative description: Generating poetry from images by multi-adversarial training", In Proc. ACM Multimedia 2018, Best Paper, 2018. ([pdf])

    G. Digital Audio Compression

  34. D. Pan, "A tutorial on MPEG/audio compression", IEEE Multimedia, 2(2):60-74, 1995. ([pdf])
  35. S. Shlien, "Guide to MPEG-1 audio standard", IEEE Trans. on Broadcasting, 40(4):206-218, 1994. ([pdf])
   

New Suggested Papers:

  1. F. Jiang, et al., "An end-to-end compression framework based on convolutional neural networks", IEEE Trans. on Circuits and Systems for Video Technology, 28(10):3007-3018, 2018. ([pdf]) See also a (blog post) summarizing the paper.
  2. Z. Liu, et al., "DeepN-JPEG: A deep neural network favorable JPEG-based image compression framework", 2018. ([pdf])
  3. L. Galteri, et al., "Deep generative adversarial compression artifact removal", Proc. Int. Conf. on Computer Vision (ICCV 2017), 2017. ([pdf])
  4. R. Yang, et al., "Enhancing quality for HEVC compressed videos", 2018. ([pdf])
  5. Z. Cao, et al., "Realtime multi-person 2D pose estimation using part affinity fields", Proc. IEEE Conf. on Computer Vision and Pattern Recognition (CVPR2017), 2017. ([pdf])
   

Reference Papers:

Digital Image and Video

  1. J.F. Blinn, "NTSC: Nice Technology, Super Color", IEEE Computer Graphics and Applications, 13(2):17-23, 1993. ([pdf])
  2. J.F. Blinn, "The world of digital video", IEEE Computer Graphics and Applications, 12(5):106-112, 1992. ([pdf])

MPEG and H.264

  1. D. Le Gall, "MPEG: a video compression standard for multimedia applications", Communications of the ACM, 34(4):46-58, 1991. ([pdf])
  2. G.J. Sullivan and T. Wiegand, "Video compression -- From concepts to the H.264/AVC standard", Proceedings of the IEEE, 93(1):18-31, 2005. ([pdf])
  3. D. Marpe, H. Schwarz, and T. Wiegand, "Context-based Adaptive Binary Arithmetic Coding in the H.264/AVC video compression standard", IEEE Trans. on Circuits and Systems for Video Technology, 13(7):620-636, 2003. ([pdf])
  4. V. Sze and M. Budagavi, "High Throughput CABAC Entropy Coding in HEVC", IEEE Trans. on Circuits and Systems for Video Technology, 22(12):1778-1791, 2012. ([pdf])
  5. H.S. Malvar, et al., "Low-complexity transform and quantization in H.264/AVC", IEEE Trans. on Circuits and Systems for Video Technology, 13(7):598-603, 2003. ([pdf])
  6. P. List, et al., "Adaptive deblocking filter", IEEE Trans. on Circuits and Systems for Video Technology, 13(7):614-619, 2003. ([pdf])
  7. T. Takishima, M. Wada and H. Murakami, "Reversible variable length codes", IEEE Trans. on Communications, 43(2/3/4):158-162, 1995. ([pdf])
  8. J. Ostermann, E.S. Jang, J. Shin, and T. Chen, "Coding of arbitrarily shaped video objects in MPEG-4", In Proc. Int. Conf. on Image Processing (ICIP '97), 496-499, 1997. ([pdf])

Wavelets and JPEG-2000

  1. P.J. Burt, "The Laplacian pyramid as a compact image code", IEEE Trans. on Communications, COM-31(4):532-540, 1983. ([pdf])
  2. D. Taubman, "High performance scalable image compression with EBCOT", IEEE Trans. on Image Processing, 9(7):1158-1170, 2000. ([pdf])
  3. B.E. Usevitch, "A tutorial on modern lossy wavelet image compression: Foundations of JPEG2000", IEEE Signal Processing Magazine, 18(5):22-35, 2001. ([pdf])
  4. O. Rioul and M. Vetterli, "Wavelets and signal processing", IEEE Signal Processing Magazine, 8(4):14-38, Oct. 1991. ([pdf])
  5. A. Said and W.A. Pearlman, "A new, fast, and efficient image codec based on set partitioning in hierarchical trees", IEEE Trans. on Circuits and Systems for Video Technology, 6(3):243-250, 1996. ([pdf])
  6. D. Taubman, "Embedded block coding in JPEG2000", Proc. Int. Conf. on Image Processing (ICIP '2000), Vol. II, 33-36, 2000. ([pdf])
  7. K. Varma and A. Bell, "JPEG2000 -- Choices and tradeoffs for encoders", IEEE Signal Processing Magazine, 21(6):70-75, 2004. ([pdf])
  8. D. Santa Cruz and T. Ebrahimi, "A Study of JPEG 2000 Still Image Coding Versus Other Standards", ISO/IEC JTC1/SC29/WG1 (ITU-T SG8), 2000.

3D Video and TV

  1. D. Scharstein and R. Szeliski, "A taxonomy and evaluation of dense two-frame stereo correspondence algorithms", Int. Journal of Computer Vision, Vol. 47, pp. 7-42, 2002. ([pdf]) [Web site with data and source code]
  2. Y. Boykov, O. Veksler, and R. Zabih, "Fast approximate energy minimization via graph cuts", IEEE Trans. on Pattern Analysis and Machine Intelligence, 23(11):1222-1239, 2001. ([pdf])
  3. J. Sun, N.N. Zheng, and H.Y. Shum, "Stereo matching using belief propagation", IEEE Trans. on Pattern Analysis and Machine Intelligence, 25(7):787-800, 2003. ([pdf])
  4. M. Levoy and P. Hanrahan, "Light field rendering", Proc. SIGGRAPH 96, 1996. ([pdf]) [Levoy's web page]
  5. M. Lang, et al., "Nonlinear disparity mapping for stereoscopic 3D", ACM Transactions on Graphics, 29(4), 2010. ([pdf])

Image and Video Quality Assessment

  1. Z. Wang, et al., "Image quality assessment: From error visibility to structural similarity", IEEE Trans. on Image Processing, 13(4):600-612, 2004. ([pdf])
  2. T.K. Tan, et al., "Video quality evaluation methodology and verification testing of HEVC compression performance", IEEE Trans. on Circuits and Systems for Video Technology, 26(1):76-90, 2016. ([pdf])
  3. L. Goldmann and T. Ebrahimi, "Towards reliable and reproducible 3D video quality assessment", Proc. of SPIE, Vol. 8043, Three-Dimensional Imaging, Visualization, and Display, 2011. ([pdf])
  4. M. Lambooij, et al., "Visual discomfort and visual fatigue of stereoscopic displays: A review", Journal of Imaging Science and Technology, 53(3):030201, pp. 1-14, 2009. ([pdf])
  5. A.K. Moorthy and A.C. Bovik, "Visual quality assessment algorithms: What does the future hold?", Int. Journal of Multimedia Tools and Applications, 51(2):675-696, 2011. ([pdf])
  6. A. Mittal, A.K. Moorthy and A.C. Bovik, "Visually lossless H.264 compression of natural videos", The Computer Journal, 2012. ([pdf])
  7. K. Seshadrinathan and A.C. Bovik, "Automatic prediction of perceptual quality of multimedia signals -- a survey", Int. Journal of Multimedia Tools and Applications, Vol. 51, pp. 163-186, 2011. ([pdf])

Content Based Image and Video Retrieval

  1. R. Arandjelovic and A. Zisserman, "Three things everyone should know to improve object retrieval", Proc. IEEE Conf. on Computer Vision and Pattern Recognition (CVPR2012), 2012. ([pdf])
  2. J.W.H. Tangelder and R.C. Veltkamp, "A survey of content based 3D shape retrieval methods", Multimedia Tools and Applications, 39:441-471, 2008. ([pdf])

Visual Content Analysis

  1. J. Malik, "Deep visual understanding from deep learning", A Keynote Presentation, 2017. ([video])
  2. S. Ren, et al., "Faster R-CNN: Towards real-time object detection with region proposal networks", IEEE Trans. on Pattern Analysis and Machine Intelligence, 39(6):1137-1149, 2017. ([pdf])
  3. J. Choi, et al., "Evento 360: Social event discovery from web-scale multimedia collection", In Proc. ACM Multimedia 2015, pp. 193-196, 2015 (Grand Challenge Winner). ([pdf])
  4. T. Dean, et al., "Fast, accurate detection of 100,000 object classes on a single machine", Proc. IEEE Conf. on Computer Vision and Pattern Recognition (CVPR2013), 2013. ([pdf])
  5. F. Schroff, A. Criminisi and A. Zisserman, "Harvesting image databases from the web", IEEE Trans. on Pattern Analysis and Machine Intelligence, 33(4):754-766, 2011. ([pdf])
  6. D.G. Lowe, "Distinctive image features from scale-invariant keypoints", Int. Journal of Computer Vision, 60(2):91-110, 2004. ([pdf])

Digital Audio Compression

  1. J.D. Johnston, S.R. Quackenbush, J. Herre and B. Grill, "Review of MPEG-4 General Audio Coding", in Multimedia Systems, Standards, and Networks, A. Puri and T. Chen (eds.), New York: Marcel Dekker, pp. 131-155, 2000. ([pdf])
  2. S. Quackenbush and A. Lindsay, "Overview of MPEG-7 audio", IEEE Transactions on Circuits and Systems for Video Technology, Special issue on MPEG-7, 11(6):725-729, 2001. ([pdf])
   

Recommended Textbook:

  1. Z.N. Li, M.S. Drew, and J. Liu, "Fundamentals of Multimedia", 2nd ed., Springer Switzerland, 2014.
    (ISBN: 978-3-319-05289-2, students can download via www.lib.sfu.ca)

Reference Books:

  1. V. Bhaskaran and K. Konstantinides, "Image and Video Compression Standards, Algorithms and Architectures", 2nd ed., Kluwer Academic Publishers, 1997.
    (In Library Reserve, TA 1632 B49 1997)
  2. Y. Wang, J. Ostermann and Y.Q. Zhang, "Video Processing and Communications", Prentice Hall, 2002.
    (In Library Reserve, TK 5105.2 W36 2002)
  3. I.E. Richardson, "The H.264 Advanced Video Compression Standard", John Wiley & Sons, 2010.
    (In Library Reserve, TK 6680.5 R52 2010)
  4. R. Szeliski, "Computer Vision: Algorithms and Applications", Springer, 2010. Pre-publication versions online: http://szeliski.org/Book/
  5. I.E. Richardson, "Video Codec Design: Developing Image and Video Compression Systems", John Wiley & Sons, 2002. (ISBN: 0471485535)
  6. D.S. Taubman and M.W. Marcellin, "JPEG2000: Image Compression Fundamentals, Standards, and Practice", Kluwer Academic Publishers, 2002.
  7. F. Pereira and T. Ebrahimi, "The MPEG-4 Book", Prentice Hall, 2002. (ISBN: 0-13-061621-4)
  8. B.S. Manjunath, et al., "Introduction to MPEG-7: Multimedia Content Description Interface", Wiley, 2002. (ISBN-13: 978-0-471-48678-7)
  9. E. Trucco and A. Verri, "Introductory Techniques for 3-D Computer Vision", Prentice-Hall, 1998.
  10. D.A. Forsyth and J. Ponce, "Computer Vision: A Modern Approach", 2nd ed., Prentice Hall, 2011. (ISBN-13: 9780136085928)