CMPT 820   Multimedia Systems
Spring 2019

Reading List

You will be choosing papers from this list for your contributions to our seminar presentations.
(You may also introduce different papers to discuss, with approval.)


    A. Introduction to Digital Image and Video Compression

  1. G.K. Wallace, "The JPEG still picture compression standard", Communications of the ACM, 34(4):30-44, 1991. ([pdf])
  2. J.F. Blinn, "What's the deal with the DCT?", IEEE Computer Graphics and Applications, 13(4):78-83, 1993. ([pdf])
  3. J.F. Blinn, "NTSC: Nice Technology, Super Color", IEEE Computer Graphics and Applications, 13(2):17-23, 1993. ([pdf])
  4. R. Schafer and T. Sikora, "Digital video coding standards and their role in video communications", Proceedings of the IEEE, 83(6):907-924, 1995. ([pdf])

    B. MPEG-4/H.264, H.265 and MPEG-7

  5. T. Sikora, "The MPEG-4 video standard verification model", IEEE Trans. on Circuits and Systems for Video Technology, 7(1):19-31, 1997. ([pdf])
  6. T. Wiegand, G.J. Sullivan, G. Bjontegaard, and A. Luthra, "Overview of the H.264/AVC video coding standard", IEEE Trans. on Circuits and Systems for Video Technology, 13(7):560-576, 2003. ([pdf])
  7. G.J. Sullivan, et al., "Overview of the High Efficiency Video Coding (HEVC) Standard", IEEE Trans. on Circuits and Systems for Video Technology, 22(12):1649-1668, 2012. ([pdf])
  8. T. Sikora, "The MPEG-7 visual standard for content description -- an overview", IEEE Trans. on Circuits and Systems for Video Technology, 11(6):696-702, 2001. ([pdf])

    C. Wavelets and JPEG-2000

  9. S. Mallat, "A theory for multiresolution signal decomposition: the wavelet representation", IEEE Trans. on Pattern Analysis and Machine Intelligence, 11(7):674-693, 1989. ([pdf])
  10. M. Antonini, et al., "Image coding using wavelet transform", IEEE Trans. on Image Processing, 1(2):205-221, 1992. ([pdf])
  11. J.M. Shapiro, "Embedded image coding using zerotrees of wavelet coefficients", IEEE Trans. on Signal Processing, 41(12):3445-3462, 1993. ([pdf])
  12. C. Christopoulos, A. Skodras, and T. Ebrahimi, "The JPEG2000 still image coding system: An overview", IEEE Trans. on Consumer Electronics, 46(4):1103-1127, 2000. ([pdf])

    D. 3D Video and TV

  13. D. Scharstein and R. Szeliski, "A taxonomy and evaluation of dense two-frame stereo correspondence algorithms", Int. Journal of Computer Vision, Vol. 47, pp. 7-42, 2002. ([pdf]) [Web site with data and source code]
  14. Y. Boykov, O. Veksler, and R. Zabih, "Fast approximate energy minimization via graph cuts", IEEE Trans. on Pattern Analysis and Machine Intelligence, 23(11):1222-1239, 2001. ([pdf])
  15. J. Sun, N.N. Zheng, and H.Y. Shum, "Stereo matching using belief propagation", IEEE Trans. on Pattern Analysis and Machine Intelligence, 25(7):787-800, 2003. ([pdf])
  16. L. Zhang, C. Vazquez and S. Knorr, "3D-TV content creation: Automatic 2D-to-3D video conversion", IEEE Transactions on Broadcasting, 57(2):372-383, 2011. ([pdf])
  17. C. Kim, et al., "Multi-perspective stereoscopy from light fields", ACM Trans. on Graphics, 30(6), Article 190, 2011. ([pdf])
  18. S. Wanner and B. Goldluecke, "Globally consistent depth labeling of 4D light fields", Proc. IEEE Conf. on Computer Vision and Pattern Recognition (CVPR2012), 2012. ([pdf])

    E. Image and Video Quality Assessment

  19. Z. Wang and A.C. Bovik, "Mean squared error: Love it or leave it?", IEEE Signal Processing Magazine, 26(1):98-117, 2009. ([pdf])
  20. S. Chikkerur, et al., "Objective video quality assessment methods: A classification, review, and performance comparison", IEEE Trans. on Broadcasting, 57(2):165-182, 2011. ([pdf])
  21. S.J. Daly, R.T. Held and D.M. Hoffman, "Perceptual issues in stereoscopic signal processing", IEEE Trans. on Broadcasting, 57(2):347-361, 2011. ([pdf])

    F. Visual Content Analysis and Retrieval

  22. J. Shotton, et al., "Efficient human pose estimation from single depth images", IEEE Trans. on Pattern Analysis and Machine Intelligence, 35(12):2821-2840, 2013 ([pdf]). Short and original CVPR2011 paper ([pdf]). ([Supplementary Materials]) ([ppt])
  23. P. Huang, A. Hilton and J. Starck, "Shape similarity for 3D video sequences of people", Int. Journal of Computer Vision, 89(2-3):362-381, 2010. ([pdf])
  24. A. Farhadi and M.A. Sadeghi, "Phrasal recognition", IEEE Trans. on Pattern Analysis and Machine Intelligence, 35(12):2854-2865, 2013. ([pdf])
  25. R. Girshick, et al., "Rich feature hierarchies for accurate object detection and semantic segmentation", Proc. IEEE Conf. on Computer Vision and Pattern Recognition (CVPR2014), 2014. ([pdf])
  26. M. Malinowski, M. Rohrbach, and M. Fritz, "Ask your neurons: A neural-based approach to answering questions about images", Proc. Int. Conf. on Computer Vision (ICCV 2015), 2015. ([pdf])
  27. S. Goferman, L. Zelnik-Manor and A. Tal, "Context-aware saliency detection", IEEE Trans. on Pattern Analysis and Machine Intelligence, 34(10):1915-1926, 2012. ([pdf])
  28. L. Zhang, H. Dong, and A. El Saddik, "From 3D sensing to printing: A survey", ACM Trans. on Multimedia Computing, Communications, and Applications, 12(2):Article 27, pp. 1-22, 2015. ([pdf])

    G. Digital Audio Compression

  29. D. Pan, "A tutorial on MPEG/audio compression", IEEE Multimedia, 2(2):60-74, 1995. ([pdf])
  30. S. Shlien, "Guide to MPEG-1 audio standard", IEEE Trans. on Broadcasting, 40(4):206-218, 1994. ([pdf])
   

Reference Papers:

Digital Image and Video

  1. J.F. Blinn, "The world of digital video", IEEE Computer Graphics and Applications, 12(5):106-112, 1992. ([pdf])

MPEG and H.264

  1. D. Le Gall, "MPEG: a video compression standard for multimedia applications", Communications of the ACM, 34(4):46-58, 1991. ([pdf])
  2. G.J. Sullivan and T. Wiegand, "Video compression -- From concepts to the H.264/AVC standard", Proceedings of the IEEE, 93(1):18-31, 2005. ([pdf])
  3. D. Marpe, H. Schwarz, and T. Wiegand, "Context-based Adaptive Binary Arithmetic Coding in the H.264/AVC video compression standard", IEEE Trans. on Circuits and Systems for Video Technology, 13(7):620-636, 2003. ([pdf])
  4. V. Sze and M. Budagavi, "High Throughput CABAC Entropy Coding in HEVC", IEEE Trans. on Circuits and Systems for Video Technology, 22(12):1778-1791, 2012. ([pdf])
  5. H.S. Malvar, et al., "Low-complexity transform and quantization in H.264/AVC", IEEE Trans. on Circuits and Systems for Video Technology, 13(7):598-603, 2003. ([pdf])
  6. P. List, et al., "Adaptive deblocking filter", IEEE Trans. on Circuits and Systems for Video Technology, 13(7):614-619, 2003. ([pdf])
  7. T. Takishima, M. Wada and H. Murakami, "Reversible variable length codes", IEEE Trans. on Communications, 43(2/3/4):158-162, 1995. ([pdf])
  8. J. Ostermann, E.S. Jang, J. Shin, and T. Chen, "Coding of arbitrarily shaped video objects in MPEG-4", In Proc. Int. Conf. on Image Processing (ICIP '97), 496-499, 1997. ([pdf])

Wavelets and JPEG-2000

  1. P.J. Burt and E.H. Adelson, "The Laplacian pyramid as a compact image code", IEEE Trans. on Communications, COM-31(4):532-540, 1983. ([pdf])
  2. D. Taubman, "High performance scalable image compression with EBCOT", IEEE Trans. on Image Processing, 9(7):1158-1170, 2000. ([pdf])
  3. B.E. Usevitch, "A tutorial on modern lossy wavelet image compression: Foundations of JPEG2000", IEEE Signal Processing Magazine, 18(5):22-35, 2001. ([pdf])
  4. O. Rioul and M. Vetterli, "Wavelets and signal processing", IEEE Signal Processing Magazine, 8(4):14-38, Oct. 1991. ([pdf])
  5. A. Said and W.A. Pearlman, "A new, fast, and efficient image codec based on set partitioning in hierarchical trees", IEEE Trans. on Circuits and Systems for Video Technology, 6(3):243-250, 1996. ([pdf])
  6. D. Taubman, "Embedded block coding in JPEG2000", Proc. Int. Conf. on Image Processing (ICIP 2000), Vol. II, 33-36, 2000. ([pdf])
  7. K. Varma and A. Bell, "JPEG2000 -- Choices and tradeoffs for encoders", IEEE Signal Processing Magazine, 21(6):70-75, 2004. ([pdf])
  8. D. Santa Cruz and T. Ebrahimi, "A Study of JPEG 2000 Still Image Coding Versus Other Standards", ISO/IEC JTC1/SC29/WG1 (ITU-T SG8), 2000.

3D Video and TV

  1. M. Levoy and P. Hanrahan, "Light field rendering", Proc. SIGGRAPH 96, 1996. ([pdf]) [Levoy's web page]
  2. C.L. Zitnick, et al., "High-quality video view interpolation using a layered representation", ACM Transactions on Graphics, 23(3):600-608, 2004. ([pdf])
  3. M. Lang, et al., "Nonlinear disparity mapping for stereoscopic 3D", ACM Transactions on Graphics, 29(4), 2010. ([pdf])
  4. L. Shapira, et al., "Contextual part analogies in 3D objects", Int. Journal of Computer Vision, 89(2-3):309-326, 2010. ([pdf])
  5. N.S. Holliman, et al., "Three-dimensional displays: A review and applications analysis", IEEE Transactions on Broadcasting, 57(2):362-371, 2011. ([pdf])
  6. A. Kubota, et al., "Multiview imaging and 3DTV", IEEE Signal Processing Magazine, 24(6):10-21, 2007. ([pdf])
  7. P. Felzenszwalb and D. Huttenlocher, "Efficient belief propagation for early vision", Proc. IEEE Conf. on Computer Vision and Pattern Recognition (CVPR2004), 2004. ([pdf])
  8. J. Starck, J. Kilner, and A. Hilton, "A free-viewpoint video renderer", Journal of Graphics, GPU and Game Tools, 14(3):57-72, 2009. ([pdf])
  9. Z.N. Li and G. Hu, "Analysis of disparity gradient based cooperative stereo", IEEE Trans. on Image Processing, 5(11):1493-1506, 1996. ([pdf])

Image and Video Quality Assessment

  1. Z. Wang, et al., "Image quality assessment: From error visibility to structural similarity", IEEE Trans. on Image Processing, 13(4):600-612, 2004. ([pdf])
  2. T.K. Tan, et al., "Video quality evaluation methodology and verification testing of HEVC compression performance", IEEE Trans. on Circuits and Systems for Video Technology, 26(1):76-90, 2016. ([pdf])
  3. M.H. Pinson and S. Wolf, "A new standardized method for objectively measuring video quality", IEEE Trans. on Broadcasting, 50(3):312-322, 2004. ([pdf])
  4. D.M. Chandler and S.S. Hemami, "VSNR: A wavelet-based visual signal-to-noise ratio for natural images", IEEE Trans. on Image Processing, 16(9):2284-2298, 2007. ([pdf])
  5. L. Goldmann and T. Ebrahimi, "Towards reliable and reproducible 3D video quality assessment", Proc. of SPIE, Vol. 8043, Three-Dimensional Imaging, Visualization, and Display, 2011. ([pdf])
  6. M. Lambooij, et al., "Visual discomfort and visual fatigue of stereoscopic displays: A review", Journal of Imaging Science and Technology, 53(3):030201, pp. 1-14, 2009. ([pdf])
  7. A.K. Moorthy and A.C. Bovik, "Visual quality assessment algorithms: What does the future hold?", Multimedia Tools and Applications, 51(2):675-696, 2011. ([pdf])
  8. A. Mittal, A.K. Moorthy and A.C. Bovik, "Visually lossless H.264 compression of natural videos", The Computer Journal, 2012. ([pdf])
  9. A. Rehman and Z. Wang, "Reduced-reference image quality assessment by structural similarity estimation", IEEE Trans. on Image Processing, 21(8):3378-3389, 2012. ([pdf])
  10. K. Seshadrinathan and A.C. Bovik, "Automatic prediction of perceptual quality of multimedia signals -- a survey", Multimedia Tools and Applications, Vol. 51, pp. 163-186, 2011. ([pdf])
  11. Z. Wang and A.C. Bovik, "Reduced- and no-reference image quality assessment", IEEE Signal Processing Magazine, Special Issue on Multimedia Quality Assessment, 28(6):29-40, Nov. 2011. ([pdf])

Visual Content Analysis and Retrieval

  1. J. Choi, et al., "Evento 360: Social event discovery from web-scale multimedia collection", In Proc. ACM Multimedia 2015, pp. 193-196, 2015 (Grand Challenge Winner). ([pdf])
  2. T. Dean, et al., "Fast, accurate detection of 100,000 object classes on a single machine", Proc. IEEE Conf. on Computer Vision and Pattern Recognition (CVPR2013), 2013. ([pdf])
  3. G. Wang, D. Hoiem and D. Forsyth, "Learning image similarity from Flickr groups using fast kernel machines", IEEE Trans. on Pattern Analysis and Machine Intelligence, 34(11):2177-2188, 2012. ([pdf])
  4. A.W.M. Smeulders, et al., "Content-based image retrieval at the end of the early years", IEEE Trans. on Pattern Analysis and Machine Intelligence, 22(12):1349-1380, 2000. ([pdf])
  5. R. Arandjelovic and A. Zisserman, "Three things everyone should know to improve object retrieval", Proc. IEEE Conf. on Computer Vision and Pattern Recognition (CVPR2012), 2012. ([pdf])
  6. F. Schroff, A. Criminisi and A. Zisserman, "Harvesting image databases from the web", IEEE Trans. on Pattern Analysis and Machine Intelligence, 33(4):754-766, 2011. ([pdf])
  7. P. Papadakis, et al., "Panorama: A 3D shape descriptor based on panoramic views for unsupervised 3D object retrieval", Int. Journal of Computer Vision, 89(2-3):177-192, 2010. ([pdf])
  8. J.W.H. Tangelder and R.C. Veltkamp, "A survey of content based 3D shape retrieval methods", Multimedia Tools and Applications, 39:441-471, 2008. ([pdf])
  9. M.J. Swain and D.H. Ballard, "Color indexing", Int. Journal of Computer Vision, 7(1):11-32, 1991. ([pdf])
  10. D.G. Lowe, "Distinctive image features from scale-invariant keypoints", Int. Journal of Computer Vision, 60(2):91-110, 2004. ([pdf])
  11. M.S. Drew, Z.N. Li and Z. Tauber, "Illumination color covariant locale-based visual object retrieval", Pattern Recognition, Special Issue on Color Machine Vision, 35(8):1687-1704, 2002. ([pdf])
  12. C. Carson, S. Belongie, H. Greenspan, and J. Malik, "Blobworld: Image Segmentation Using Expectation-Maximization and Its Application to Image Querying", IEEE Trans. on Pattern Analysis and Machine Intelligence, 24(8):1026-1038, 2002. ([pdf])

Digital Audio Compression

  1. J.D. Johnston, S.R. Quackenbush, J. Herre and B. Grill, "Review of MPEG-4 General Audio Coding", in Multimedia Systems, Standards, and Networks, A. Puri and T. Chen (eds.), New York: Marcel Dekker, pp. 131-155, 2000. ([pdf])
  2. S. Quackenbush and A. Lindsay, "Overview of MPEG-7 audio", IEEE Transactions on Circuits and Systems for Video Technology, Special issue on MPEG-7, 11(6):725-729, 2001. ([pdf])
   

Recommended Textbook:

  1. Z.N. Li, M.S. Drew, and J. Liu, "Fundamentals of Multimedia", 2nd ed., Springer Switzerland, 2014.
    (ISBN: 978-3-319-05289-2, students can download via www.lib.sfu.ca)

Reference Books:

  1. V. Bhaskaran and K. Konstantinides, "Image and Video Compression Standards, Algorithms and Architectures", 2nd ed., Kluwer Academic Publishers, 1997.
    (In Library Reserve, TA 1632 B49 1997)
  2. Y. Wang, J. Ostermann and Y.Q. Zhang, "Video Processing and Communications", Prentice Hall, 2002.
    (In Library Reserve, TK 5105.2 W36 2002)
  3. I.E. Richardson, "The H.264 Advanced Video Compression Standard", John Wiley & Sons, 2010.
    (In Library Reserve, TK 6680.5 R52 2010)
  4. R. Szeliski, "Computer Vision: Algorithms and Applications", Springer, 2010. Pre-publication versions online: http://szeliski.org/Book/
  5. I.E. Richardson, "Video Codec Design: Developing Image and Video Compression Systems", John Wiley & Sons, 2002. (ISBN: 0471485535)
  6. D.S. Taubman and M.W. Marcellin, "JPEG2000: Image Compression Fundamentals, Standards, and Practice", Kluwer Academic Publishers, 2002.
  7. F. Pereira and T. Ebrahimi, "The MPEG-4 Book", Prentice Hall, 2002. (ISBN: 0-13-061621-4)
  8. B.S. Manjunath, et al., "Introduction to MPEG-7: Multimedia Content Description Interface", Wiley, 2002. (ISBN-13: 978-0-471-48678-7)
  9. E. Trucco and A. Verri, "Introductory Techniques for 3-D Computer Vision", Prentice-Hall, 1998.
  10. D.A. Forsyth and J. Ponce, "Computer Vision: A Modern Approach", 2nd ed., Prentice Hall, 2011. (ISBN-13: 9780136085928)