Building a Python-based video server
Posted August 24, 2013 in Programming

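This article builds a video server on a Python backend that combines video upload, format conversion, and playback. The backend is based on the Django framework with Amazon S3 for storage, format conversion uses Encoding.com's online service, and the message queue is RabbitMQ. Once a video has been uploaded and converted, it is played in the browser with HTML5 using Video.js.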

Stickyworld's consultation web app has supported video for a long time but it's been hosted via a YouTube embed. When we started building the new version of the web app we wanted to take control of the video content and also free our users from YouTube's terms of service.

I personally had worked on projects with clients in the past which did video transcoding and it was never something easy to achieve. It takes a lot to accept every video, audio and container format under the sun and output them into various video formats that the web knows and loves.

With that in mind we decided the conversion process would be handled by Encoding.com. They let you encode your first gigabyte of video for free and have a tiered pricing system thereafter.

Throughout the development of the code below I would upload a two-second, 178KB video to test that everything was working. When the exceptions stopped being raised, I tested larger and more exotic files.

Stage 1: The User uploads a video

At the moment the new codebase just has a quick-and-dirty HTML5-based uploading mechanism. This is the CoffeeScript for uploading from the client to the server:

$scope.upload_slide = (upload_slide_form) ->
    # Only the first selected file is uploaded for now
    file = document.getElementById("slide_file").files[0]
    reader = new FileReader()
    # Handlers are registered before the read is started
    reader.onload = (event) ->
      # event.target.result is a base64-encoded data URL
      result = event.target.result
      $.post "/world/upload_slide",
        data: result
        name: file.name
        room_id: $scope.room.id
        (response_data) ->
          # `is not yes` compiles to `=== !true`; `unless` states the intent
          unless response_data.success
            console.error "There was an error uploading the file", response_data
          else
            console.log "Upload successful", response_data
    reader.onloadstart = ->
      console.log "onloadstart"
    reader.onprogress = (event) ->
      console.log "onprogress", event.total, event.loaded, (event.loaded / event.total) * 100
    reader.onabort = ->
      console.log "onabort"
    reader.onerror = ->
      console.log "onerror"
    reader.onloadend = (event) ->
      console.log "onloadend", event
    reader.readAsDataURL file

It would be nice to loop through document.getElementById("slide_file").files and upload each file individually instead of just the first one. This is something we'll be addressing soon.

Stage 2: Validate and upload to S3

On the backend we're running Django and RabbitMQ. The key modules we're using are:

$ pip install 'Django>=1.5.2' 'django-celery>=3.0.21' \
    'django-storages>=1.1.8' 'lxml>=3.2.3' 'python-magic>=0.4.3'

I created two models: SlideUploadQueue to store references to all the uploads and SlideVideoMedia to store all the references to the processed videos.

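from datetime import datetime

from django.contrib.auth.models import User
from django.db import models
import pytz

# MediaType is defined elsewhere in the codebase;
# filename_sanitiser is shown further down
class SlideUploadQueue(models.Model):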
    created_by = models.ForeignKey(User)
    created_time = models.DateTimeField(db_index=True)
    original_file = models.FileField(
        upload_to=filename_sanitiser, blank=True, default='')
    media_type = models.ForeignKey(MediaType)
    encoding_com_tracking_code = models.CharField(
        default='', max_length=24, blank=True)

    STATUS_AWAITING_DATA = 0
    STATUS_AWAITING_PROCESSING = 1
    STATUS_PROCESSING = 2
    STATUS_AWAITING_3RD_PARTY_PROCESSING = 5
    STATUS_FINISHED = 3
    STATUS_FAILED = 4

    STATUS_LIST = (
        (STATUS_AWAITING_DATA, 'Awaiting Data'),
        (STATUS_AWAITING_PROCESSING, 'Awaiting processing'),
        (STATUS_PROCESSING, 'Processing'),
        (STATUS_AWAITING_3RD_PARTY_PROCESSING,
            'Awaiting 3rd-party processing'),
        (STATUS_FINISHED, 'Finished'),
        (STATUS_FAILED, 'Failed'),
    )

    status = models.PositiveSmallIntegerField(
        default=STATUS_AWAITING_DATA, choices=STATUS_LIST)

    class Meta:
        verbose_name = 'Slide'
        verbose_name_plural = 'Slide upload queue'

    def save(self, *args, **kwargs):
        if not self.created_time:
            self.created_time = \
                datetime.utcnow().replace(tzinfo=pytz.utc)

        return super(SlideUploadQueue, self).save(*args, **kwargs)

    def __unicode__(self):
        if self.id is None:
            return 'new <SlideUploadQueue>'
        return '<SlideUploadQueue> %d' % self.id

class SlideVideoMedia(models.Model):
    converted_file = models.FileField(
        upload_to=filename_sanitiser, blank=True, default='')

    FORMAT_MP4 = 0
    FORMAT_WEBM = 1
    FORMAT_OGG = 2
    FORMAT_FL9 = 3
    FORMAT_THUMB = 4

    supported_formats = (
        (FORMAT_MP4, 'MPEG 4'),
        (FORMAT_WEBM, 'WebM'),
        (FORMAT_OGG, 'OGG'),
        (FORMAT_FL9, 'Flash 9 Video'),
        (FORMAT_THUMB, 'Thumbnail'),
    )

    mime_types = (
        (FORMAT_MP4, 'video/mp4'),
        (FORMAT_WEBM, 'video/webm'),
        (FORMAT_OGG, 'video/ogg'),
        (FORMAT_FL9, 'video/mp4'),
        (FORMAT_THUMB, 'image/jpeg'),
    )

    format = models.PositiveSmallIntegerField(
        default=FORMAT_MP4, choices=supported_formats)

    class Meta:
        verbose_name = 'Slide video'
        verbose_name_plural = 'Slide videos'

    def __unicode__(self):
        if self.id is None:
            return 'new <SlideVideoMedia>'
        return '<SlideVideoMedia> %d' % self.id

Our models use a filename_sanitiser function in each models.FileField field to automatically rewrite filenames into a <model>/<uuid4>.<extension> format. This sanitises each filename and makes sure it's unique. On top of that, we use signed URLs that expire so we can control who is served our content and for how long.

import uuid

def filename_sanitiser(instance, filename):
    folder = instance.__class__.__name__.lower()
    ext = 'jpg'

    if '.' in filename:
        t_ext = filename.split('.')[-1].strip().lower()

        if t_ext != '':
            ext = t_ext

    return '%s/%s.%s' % (folder, str(uuid.uuid4()), ext)

A file uploaded as testing.mov would turn into https://our-bucket.s3.amazonaws.com/slideuploadqueue/3fe27193-e87f-4244-9aa2-66409f70ebd3.mov and would be uploaded by Django Storages.
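
To see the naming scheme in isolation, here is a minimal sketch that runs the same logic outside Django. It is written for Python 3, and the bare SlideUploadQueue class is a hypothetical stand-in for the real model; only the function body mirrors the one above.

```python
import uuid


def filename_sanitiser(instance, filename):
    # Folder is the lowercased model class name, e.g. 'slideuploadqueue'
    folder = instance.__class__.__name__.lower()
    ext = 'jpg'  # fallback when the upload has no usable extension

    if '.' in filename:
        t_ext = filename.split('.')[-1].strip().lower()
        if t_ext != '':
            ext = t_ext

    return '%s/%s.%s' % (folder, uuid.uuid4(), ext)


class SlideUploadQueue(object):
    """Hypothetical stand-in for the Django model, for illustration only."""


path = filename_sanitiser(SlideUploadQueue(), 'testing.mov')
folder, name = path.split('/')
stem, ext = name.rsplit('.', 1)
print(path)  # slideuploadqueue/<random uuid4>.mov
```

The original filename is discarded apart from its extension, so two users uploading testing.mov never collide and nothing user-supplied ends up in the S3 key.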

In the backend endpoint that receives the video from the browser, we validate the uploaded content using Magic. This detects what kind of file it is based on its contents:

import base64

import magic  # provided by the python-magic package

@verify_auth_token
@return_json
def upload_slide(request):
    file_data = request.POST.get('data', '')
    file_data = base64.b64decode(file_data.split(';base64,')[1])
    description = magic.from_buffer(file_data)

So if description matches something like MPEG v4 system or Apple QuickTime movie then we know it's suitable for transcoding. If it isn't something like the above, we can flag it up with the user.
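
As a rough sketch of that check, the two helpers below decode the data URL defensively and compare a description against an allow-list. The helper names and the allow-list are assumptions for illustration (only the two description strings come from the text above), and the code targets Python 3 with the standard library only.

```python
import base64

# Substrings taken from the descriptions mentioned above; a real
# allow-list would probably be longer (assumption).
TRANSCODABLE_HINTS = ('MPEG v4 system', 'Apple QuickTime movie')


def decode_data_url(data_url):
    """Return the raw bytes of a base64 data URL, or None if malformed."""
    marker = ';base64,'
    if marker not in data_url:
        return None
    encoded = data_url.split(marker, 1)[1]
    try:
        # validate=True rejects characters outside the base64 alphabet
        return base64.b64decode(encoded, validate=True)
    except ValueError:
        return None


def is_transcodable(description):
    """Check a file description against the allow-list."""
    return any(hint in description for hint in TRANSCODABLE_HINTS)
```

In the view, the bytes returned by decode_data_url would be passed to magic.from_buffer, and the resulting description to is_transcodable. Returning None instead of raising lets the endpoint answer a malformed upload with a normal error response.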

Next, we'll save the video into our SlideUploadQueue model and send a job off to RabbitMQ. Because we're using Django Storages, it'll be uploaded to Amazon S3 automatically.

slide_upload = SlideUploadQueue()
...
slide_upload.status = SlideUploadQueue.STATUS_AWAITING_PROCESSING
slide_upload.save()
slide_upload.original_file.\
    save('anything.%s' % file_ext, ContentFile(file_data))
slide_upload.save()

task = ConvertRawSlideToSlide()
task.delay(slide_upload)
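
The automatic upload relies on Django Storages being set as the default file storage. The fragment below is a sketch of the relevant settings.py entries for the boto-based S3 backend; reading credentials from environment variables and the one-hour expiry are assumptions, and the bucket name is borrowed from the example URL earlier.

```python
import os

# Route every FileField through S3 (django-storages' boto backend)
DEFAULT_FILE_STORAGE = 'storages.backends.s3boto.S3BotoStorage'

# Assumption: credentials are supplied through the environment
AWS_ACCESS_KEY_ID = os.environ.get('AWS_ACCESS_KEY_ID', '')
AWS_SECRET_ACCESS_KEY = os.environ.get('AWS_SECRET_ACCESS_KEY', '')
AWS_STORAGE_BUCKET_NAME = 'our-bucket'  # bucket from the example URL

# Serve files through signed URLs that stop working after an hour
AWS_QUERYSTRING_AUTH = True
AWS_QUERYSTRING_EXPIRE = 3600  # seconds; the value is an assumption
```

With these in place, original_file.url returns a signed, expiring link, which is also the URL handed to the transcoding service in the next stage.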

Stage 3: Send the video to a 3rd party

RabbitMQ will queue the task.delay(slide_upload) call and a Celery worker will pick it up and run it.

Here all we're doing is sending Encoding.com a URL of our video and instructions on what output formats we want. They'll give us a job number in return, which we'll use to check on the progress of the transcoding at a later point.

class ConvertRawSlideToSlide(Task):
    queue = 'backend_convert_raw_slides'
    ...
    def _handle_video(self, slide_upload):
        mp4 = {
            'output': 'mp4',
            'size': '320x240',
            'bitrate': '256k',
            'audio_bitrate': '64k',
            'audio_channels_number': '2',
            'keep_aspect_ratio': 'yes',
            'video_codec': 'mpeg4',
            'profile': 'main',
            'vcodecparameters': 'no',
            'audio_codec': 'libfaac',
            'two_pass': 'no',
            'cbr': 'no',
            'deinterlacing': 'no',
            'keyframe': '300',
            'audio_volume': '100',
            'file_extension': 'mp4',
            'hint': 'no',
        }

        webm = {
            'output': 'webm',
            'size': '320x240',
            'bitrate': '256k',
            'audio_bitrate': '64k',
            'audio_sample_rate': '44100',
            'audio_channels_number': '2',
            'keep_aspect_ratio': 'yes',
            'video_codec': 'libvpx',
            'profile': 'baseline',
            'vcodecparameters': 'no',
            'audio_codec': 'libvorbis',
            'two_pass': 'no',
            'cbr': 'no',
            'deinterlacing': 'no',
            'keyframe': '300',
            'audio_volume': '100',
            'preset': '6',
            'file_extension': 'webm',
            'acbr': 'no',
        }

        ogg = {
            'output': 'ogg',
            'size': '320x240',
            'bitrate': '256k',
            'audio_bitrate': '64k',
            'audio_sample_rate': '44100',
            'audio_channels_number': '2',
            'keep_aspect_ratio': 'yes',
            'video_codec': 'libtheora',
            'profile': 'baseline',
            'vcodecparameters': 'no',
            'audio_codec': 'libvorbis',
            'two_pass': 'no',
            'cbr': 'no',
            'deinterlacing': 'no',
            'keyframe': '300',
            'audio_volume': '100',
            'file_extension': 'ogg',
            'acbr': 'no',
        }

        flv = {
            'output': 'fl9',
            'size': '320x240',
            'bitrate': '256k',
            'audio_bitrate': '64k',
            'audio_channels_number': '2',
            'keep_aspect_ratio': 'yes',
            'video_codec': 'libx264',
            'profile': 'high',
            'vcodecparameters': 'no',
            'audio_codec': 'libfaac',
            'two_pass': 'no',
            'cbr': 'no',
            'deinterlacing': 'no',
            'keyframe': '300',
            'audio_volume': '100',
            'file_extension': 'mp4',
        }

        thumbnail = {
            'output': 'thumbnail',
            'time': '5',
            'video_codec': 'mjpeg',
            'keep_aspect_ratio': 'yes',
            'file_extension': 'jpg',
        }

        encoder = Encoding(settings.ENCODING_API_USER_ID,
            settings.ENCODING_API_USER_KEY)
        resp = encoder.add_media(source=[slide_upload.original_file.url],
            formats=[mp4, webm, ogg, flv, thumbnail])

        media_id = None

        if resp is not None and resp.get('response') is not None:
            media_id = resp.get('response').get('MediaID')

        if media_id is None:
            slide_upload.status = SlideUploadQueue.STATUS_FAILED
            slide_upload.save()
            log.error('Unable to communicate with encoding.com')
            return False

        slide_upload.encoding_com_tracking_code = media_id
        slide_upload.status = \
            SlideUploadQueue.STATUS_AWAITING_3RD_PARTY_PROCESSING
        slide_upload.save()
        return True
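
The four video dictionaries above share eleven settings and differ only in codec-specific keys. One possible way to cut the repetition, sketched here with hypothetical names (COMMON_VIDEO_SETTINGS and video_format are not part of our codebase), is to keep the shared values in one place and apply per-format overrides:

```python
# Settings shared by all four video outputs in the dictionaries above
COMMON_VIDEO_SETTINGS = {
    'size': '320x240',
    'bitrate': '256k',
    'audio_bitrate': '64k',
    'audio_channels_number': '2',
    'keep_aspect_ratio': 'yes',
    'vcodecparameters': 'no',
    'two_pass': 'no',
    'cbr': 'no',
    'deinterlacing': 'no',
    'keyframe': '300',
    'audio_volume': '100',
}


def video_format(**overrides):
    """Return a copy of the shared settings with per-format values applied."""
    fmt = dict(COMMON_VIDEO_SETTINGS)
    fmt.update(overrides)
    return fmt


mp4 = video_format(output='mp4', video_codec='mpeg4', profile='main',
                   audio_codec='libfaac', file_extension='mp4', hint='no')
webm = video_format(output='webm', video_codec='libvpx', profile='baseline',
                    audio_codec='libvorbis', file_extension='webm',
                    audio_sample_rate='44100', preset='6', acbr='no')
```

The resulting dictionaries hold the same keys and values as the hand-written ones, so they can be passed to add_media unchanged. Changing the output size then becomes a one-line edit.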

Encoding.com recommended some less-than-ideal Python wrappers for communicating with their service. I added a few fixes to the module but there is still work to be done to get it to a state I'm happy with. Below is what this wrapper currently looks like in our codebase:

import httplib
from lxml import etree
import urllib
from xml.parsers.expat import ExpatError
import xmltodict

ENCODING_API_URL = 'manage.encoding.com:80'

class Encoding(object):

    def __init__(self, userid, userkey, url=ENCODING_API_URL):
        self.url = url
        self.userid = userid
        self.userkey = userkey

    def get_media_info(self, action='GetMediaInfo', ids=[],
        headers={'Content-Type': 'application/x-www-form-urlencoded'}):
        query = etree.Element('query')

        nodes = {
            'userid': self.userid,
            'userkey': self.userkey,
            'action': action,
            'mediaid': ','.join(ids),
        }

        query = self._build_tree(etree.Element('query'), nodes)
        results = self._execute_request(query, headers)

        return self._parse_results(results)

    def get_status(self, action='GetStatus', ids=[], extended='no',
        headers={'Content-Type': 'application/x-www-form-urlencoded'}):
        query = etree.Element('query')

        nodes = {
            'userid': self.userid,
            'userkey': self.userkey,
            'action': action,
            'extended': extended,
            'mediaid': ','.join(ids),
        }

        query = self._build_tree(etree.Element('query'), nodes)
        results = self._execute_request(query, headers)

        return self._parse_results(results)

    def add_media(self, action='AddMedia', source=[], notify='', formats=[],
        instant='no',
        headers={'Content-Type': 'application/x-www-form-urlencoded'}):
        query = etree.Element('query')

        nodes = {
            'userid': self.userid,
            'userkey': self.userkey,
            'action': action,
            'source': source,
            'notify': notify,
            'instant': instant,
        }

        query = self._build_tree(etree.Element('query'), nodes)

        for format in formats:
            format_node = self._build_tree(etree.Element('format'), format)
            query.append(format_node)

        results = self._execute_request(query, headers)
        return self._parse_results(results)

    def _build_tree(self, node, data):
        for k, v in data.items():
            if isinstance(v, list):
                for item in v:
                    element = etree.Element(k)
                    element.text = item
                    node.append(element)
            else:
                element = etree.Element(k)
                element.text = v
                node.append(element)

        return node

    def _execute_request(self, xml, headers, path='', method='POST'):
        params = urllib.urlencode({'xml': etree.tostring(xml)})

        conn = httplib.HTTPConnection(self.url)
        conn.request(method, path, params, headers)
        response = conn.getresponse()
        data = response.read()
        conn.close()
        return data

    def _parse_results(self, results):
        try:
            return xmltodict.parse(results)
        except ExpatError, e:
            print 'Error parsing encoding.com response'
            print e
            return None
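
To make the wire format concrete, this sketch builds the kind of query document that _build_tree assembles for an AddMedia call. It uses the standard library's ElementTree on Python 3 rather than lxml, and the credentials and source URL are placeholders.

```python
import xml.etree.ElementTree as ET


def build_tree(node, data):
    """Append one child per key; list values become repeated elements."""
    for key, value in data.items():
        values = value if isinstance(value, list) else [value]
        for item in values:
            child = ET.SubElement(node, key)
            child.text = item
    return node


query = build_tree(ET.Element('query'), {
    'userid': 'EXAMPLE_ID',    # placeholder credentials
    'userkey': 'EXAMPLE_KEY',
    'action': 'AddMedia',
    # Placeholder URL; in the task this is slide_upload.original_file.url
    'source': ['https://our-bucket.s3.amazonaws.com/example.mov'],
})
# Each requested output becomes its own <format> child of <query>
query.append(build_tree(ET.Element('format'),
                        {'output': 'mp4', 'size': '320x240'}))

xml_payload = ET.tostring(query, encoding='unicode')
print(xml_payload)
```

The wrapper then sends this string URL-encoded as the xml form field of a POST request.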

Still on the to-do list: HTTPS-only transmission with strict validation of Encoding.com's SSL certificate, and some unit tests (tickets, tickets and more tickets).
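
For the HTTPS item, a minimal sketch of what a verified connection could look like on Python 3 is shown below. The function name is hypothetical, and it is an assumption that the API answers on port 443 of the same host the wrapper uses; nothing is sent until a request is made on the returned object.

```python
import http.client
import ssl

ENCODING_API_HOST = 'manage.encoding.com'  # host used by the wrapper
ENCODING_API_PORT = 443  # assumption: the API is reachable over HTTPS


def open_verified_connection(host=ENCODING_API_HOST,
                             port=ENCODING_API_PORT, timeout=30):
    """Create an HTTPS connection that checks certificate and hostname."""
    context = ssl.create_default_context()
    # Both are already the defaults; set explicitly to document the intent
    context.check_hostname = True
    context.verify_mode = ssl.CERT_REQUIRED
    return http.client.HTTPSConnection(host, port, timeout=timeout,
                                       context=context)
```

A connection created this way would replace the plain HTTPConnection in _execute_request, and a certificate problem would surface as an ssl.SSLError at request time rather than being silently accepted.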

Stage 4: Download all new video formats

We have a periodic job running every 15 seconds via RabbitMQ which is checking up on the progress of the video transcoding:

class CheckUpOnThirdParties(PeriodicTask):
    run_every = timedelta(seconds=settings.THIRD_PARTY_CHECK_UP_INTERVAL)
    ...
    def _handle_encoding_com(self, slides):
        format_lookup = {
            'mp4': SlideVideoMedia.FORMAT_MP4,
            'webm': SlideVideoMedia.FORMAT_WEBM,
            'ogg': SlideVideoMedia.FORMAT_OGG,
            'fl9': SlideVideoMedia.FORMAT_FL9,
            'thumbnail': SlideVideoMedia.FORMAT_THUMB,
        }

        encoder = Encoding(settings.ENCODING_API_USER_ID,
            settings.ENCODING_API_USER_KEY)

        job_ids = [item.encoding_com_tracking_code for item in slides]
        resp = encoder.get_status(ids=job_ids)

        if resp is None:
            log.error('Unable to check up on encoding.com')
            return False

We'll go through the response from Encoding.com validating each piece of the response as we go along:


At this point we've asserted that everything is as it should be, and we can save each of the videos:

media = SlideVideoMedia()
media.format = format_lookup[output]
media.converted_file.save('blah.%s' % file_ext, ContentFile(content))
media.save()

Stage 5: Video via HTML5

On our frontend we've created a page with an HTML5 video element. We're using video.js to display the video in the best-supported format for each browser.

➫ bower install video.js
bower caching git://github.com/videojs/video.js-component.git
bower cloning git://github.com/videojs/video.js-component.git
bower fetching video.js
bower checking out video.js#v4.0.3
bower copying /home/mark/.bower/cache/video.js/5ab058cd60c5615aa38e8e706cd0f307
bower installing video.js#4.0.3

In our index.jade file we include its dependencies:

!!! 5
html(lang="en", class="no-js")
  head
    meta(http-equiv='Content-Type', content='text/html; charset=UTF-8')
    ...
    link(rel='stylesheet', type='text/css', href='/components/video-js-4.1.0/video-js.css')
    script(type='text/javascript', src='/components/video-js-4.1.0/video.js')

In an Angular.js/Jade-based template we've included a <video> tag and its child <source> tags. There is also a poster attribute that shows a static thumbnail image, transcoded from the first few moments of the video.

#main.span12
    video#example_video_1.video-js.vjs-default-skin(controls, preload="auto", width="640", height="264", poster="{{video_thumbnail}}", data-setup='{"example_option":true}', ng-show="videos")
        source(ng-repeat="video in videos", src="{{video.src}}", type="{{video.type}}")

This renders every video format we've converted into, each as its own <source> tag. Video.js will decide which of them to play based on the user's browser.
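
The template expects a `videos` list whose items carry `src` and `type`, plus a `video_thumbnail` URL for the poster. The server code that supplies these is not shown here, so the following is only a sketch of how they might be assembled. The function name `build_video_sources`, the `(output, url)` input pairs and the example URLs are all assumptions made for illustration; the MIME types are the standard ones for these three containers.

```python
# MIME types for the HTML5-playable outputs requested from the encoder.
HTML5_MIME_TYPES = {
    'mp4': 'video/mp4',
    'webm': 'video/webm',
    'ogg': 'video/ogg',
}


def build_video_sources(converted):
    """Build template data from (output_format, url) pairs.

    Hypothetical helper: `converted` stands in for whatever the saved
    media records provide. Outputs that are neither a thumbnail nor a
    known HTML5 format (for example 'fl9') are skipped.
    """
    videos = []
    thumbnail = None

    for output, url in converted:
        if output == 'thumbnail':
            thumbnail = url
        elif output in HTML5_MIME_TYPES:
            videos.append({'src': url, 'type': HTML5_MIME_TYPES[output]})

    return {'videos': videos, 'video_thumbnail': thumbnail}


if __name__ == '__main__':
    # Example URLs are placeholders, not real files.
    data = build_video_sources([
        ('mp4', 'https://example.com/media/clip.mp4'),
        ('webm', 'https://example.com/media/clip.webm'),
        ('thumbnail', 'https://example.com/media/clip.jpg'),
    ])
    print(data)
```

Serialised as JSON, a dictionary like this could be assigned to the Angular scope so that `ng-repeat="video in videos"` emits one <source> tag per entry.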

We still have a lot of work to do around fallback support, building unit tests and improving the robustness of our Encoding.com service wrapper. If this sort of work interests you please do get in touch.

Excerpted from: http://techblog.stickyworld.com/video-with-python.html
