Mirror of https://github.com/yt-dlp/yt-dlp.git (synced 2026-03-23 18:22:09 +01:00)
Compare commits: 2021.03.21...2021.03.24 (12 commits)
| Author | SHA1 | Date | |
|---|---|---|---|
| | a3affbe6a0 | | |
| | 1418a0437f | | |
| | 143db31d48 | | |
| | 3700c7ef10 | | |
| | 498f560638 | | |
| | 394dcd4486 | | |
| | 83b20a970d | | |
| | e1feb88fdf | | |
| | 389b9dbbcc | | |
| | a7f347d9c9 | | |
| | 421a459573 | | |
| | c224251aad | | |
6
.github/ISSUE_TEMPLATE/1_broken_site.md
vendored
@@ -21,7 +21,7 @@ assignees: ''
 
 <!--
 Carefully read and work through this check list in order to prevent the most common mistakes and misuse of yt-dlp:
-- First of, make sure you are using the latest version of yt-dlp. Run `yt-dlp --version` and ensure your version is 2021.03.15. If it's not, see https://github.com/yt-dlp/yt-dlp on how to update. Issues with outdated version will be REJECTED.
+- First of, make sure you are using the latest version of yt-dlp. Run `yt-dlp --version` and ensure your version is 2021.03.21. If it's not, see https://github.com/yt-dlp/yt-dlp on how to update. Issues with outdated version will be REJECTED.
 - Make sure that all provided video/audio/playlist URLs (if any) are alive and playable in a browser.
 - Make sure that all URLs and arguments with special characters are properly quoted or escaped as explained in https://github.com/yt-dlp/yt-dlp.
 - Search the bugtracker for similar issues: https://github.com/yt-dlp/yt-dlp. DO NOT post duplicates.
@@ -29,7 +29,7 @@ Carefully read and work through this check list in order to prevent the most com
 -->
 
 - [ ] I'm reporting a broken site support
-- [ ] I've verified that I'm running yt-dlp version **2021.03.15**
+- [ ] I've verified that I'm running yt-dlp version **2021.03.21**
 - [ ] I've checked that all provided URLs are alive and playable in a browser
 - [ ] I've checked that all URLs and arguments with special characters are properly quoted or escaped
 - [ ] I've searched the bugtracker for similar issues including closed ones
@@ -44,7 +44,7 @@ Add the `-v` flag to your command line you run yt-dlp with (`yt-dlp -v <your com
 [debug] User config: []
 [debug] Command-line args: [u'-v', u'http://www.youtube.com/watch?v=BaW_jenozKcj']
 [debug] Encodings: locale cp1251, fs mbcs, out cp866, pref cp1251
-[debug] yt-dlp version 2021.03.15
+[debug] yt-dlp version 2021.03.21
 [debug] Python version 2.7.11 - Windows-2003Server-5.2.3790-SP2
 [debug] exe versions: ffmpeg N-75573-g1d0487f, ffprobe N-75573-g1d0487f, rtmpdump 2.4
 [debug] Proxy map: {}
@@ -21,7 +21,7 @@ assignees: ''
 
 <!--
 Carefully read and work through this check list in order to prevent the most common mistakes and misuse of yt-dlp:
-- First of, make sure you are using the latest version of yt-dlp. Run `yt-dlp --version` and ensure your version is 2021.03.15. If it's not, see https://github.com/yt-dlp/yt-dlp on how to update. Issues with outdated version will be REJECTED.
+- First of, make sure you are using the latest version of yt-dlp. Run `yt-dlp --version` and ensure your version is 2021.03.21. If it's not, see https://github.com/yt-dlp/yt-dlp on how to update. Issues with outdated version will be REJECTED.
 - Make sure that all provided video/audio/playlist URLs (if any) are alive and playable in a browser.
 - Make sure that site you are requesting is not dedicated to copyright infringement, see https://github.com/yt-dlp/yt-dlp. yt-dlp does not support such sites. In order for site support request to be accepted all provided example URLs should not violate any copyrights.
 - Search the bugtracker for similar site support requests: https://github.com/yt-dlp/yt-dlp. DO NOT post duplicates.
@@ -29,7 +29,7 @@ Carefully read and work through this check list in order to prevent the most com
 -->
 
 - [ ] I'm reporting a new site support request
-- [ ] I've verified that I'm running yt-dlp version **2021.03.15**
+- [ ] I've verified that I'm running yt-dlp version **2021.03.21**
 - [ ] I've checked that all provided URLs are alive and playable in a browser
 - [ ] I've checked that none of provided URLs violate any copyrights
 - [ ] I've searched the bugtracker for similar site support requests including closed ones
@@ -21,13 +21,13 @@ assignees: ''
 
 <!--
 Carefully read and work through this check list in order to prevent the most common mistakes and misuse of yt-dlp:
-- First of, make sure you are using the latest version of yt-dlp. Run `yt-dlp --version` and ensure your version is 2021.03.15. If it's not, see https://github.com/yt-dlp/yt-dlp on how to update. Issues with outdated version will be REJECTED.
+- First of, make sure you are using the latest version of yt-dlp. Run `yt-dlp --version` and ensure your version is 2021.03.21. If it's not, see https://github.com/yt-dlp/yt-dlp on how to update. Issues with outdated version will be REJECTED.
 - Search the bugtracker for similar site feature requests: https://github.com/yt-dlp/yt-dlp. DO NOT post duplicates.
 - Finally, put x into all relevant boxes like this [x] (Dont forget to delete the empty space)
 -->
 
 - [ ] I'm reporting a site feature request
-- [ ] I've verified that I'm running yt-dlp version **2021.03.15**
+- [ ] I've verified that I'm running yt-dlp version **2021.03.21**
 - [ ] I've searched the bugtracker for similar site feature requests including closed ones
 
 
6
.github/ISSUE_TEMPLATE/4_bug_report.md
vendored
@@ -21,7 +21,7 @@ assignees: ''
 
 <!--
 Carefully read and work through this check list in order to prevent the most common mistakes and misuse of yt-dlp:
-- First of, make sure you are using the latest version of yt-dlp. Run `yt-dlp --version` and ensure your version is 2021.03.15. If it's not, see https://github.com/yt-dlp/yt-dlp on how to update. Issues with outdated version will be REJECTED.
+- First of, make sure you are using the latest version of yt-dlp. Run `yt-dlp --version` and ensure your version is 2021.03.21. If it's not, see https://github.com/yt-dlp/yt-dlp on how to update. Issues with outdated version will be REJECTED.
 - Make sure that all provided video/audio/playlist URLs (if any) are alive and playable in a browser.
 - Make sure that all URLs and arguments with special characters are properly quoted or escaped as explained in https://github.com/yt-dlp/yt-dlp.
 - Search the bugtracker for similar issues: https://github.com/yt-dlp/yt-dlp. DO NOT post duplicates.
@@ -30,7 +30,7 @@ Carefully read and work through this check list in order to prevent the most com
 -->
 
 - [ ] I'm reporting a broken site support issue
-- [ ] I've verified that I'm running yt-dlp version **2021.03.15**
+- [ ] I've verified that I'm running yt-dlp version **2021.03.21**
 - [ ] I've checked that all provided URLs are alive and playable in a browser
 - [ ] I've checked that all URLs and arguments with special characters are properly quoted or escaped
 - [ ] I've searched the bugtracker for similar bug reports including closed ones
@@ -46,7 +46,7 @@ Add the `-v` flag to your command line you run yt-dlp with (`yt-dlp -v <your com
 [debug] User config: []
 [debug] Command-line args: [u'-v', u'http://www.youtube.com/watch?v=BaW_jenozKcj']
 [debug] Encodings: locale cp1251, fs mbcs, out cp866, pref cp1251
-[debug] yt-dlp version 2021.03.15
+[debug] yt-dlp version 2021.03.21
 [debug] Python version 2.7.11 - Windows-2003Server-5.2.3790-SP2
 [debug] exe versions: ffmpeg N-75573-g1d0487f, ffprobe N-75573-g1d0487f, rtmpdump 2.4
 [debug] Proxy map: {}
4
.github/ISSUE_TEMPLATE/5_feature_request.md
vendored
@@ -21,13 +21,13 @@ assignees: ''
 
 <!--
 Carefully read and work through this check list in order to prevent the most common mistakes and misuse of yt-dlp:
-- First of, make sure you are using the latest version of yt-dlp. Run `yt-dlp --version` and ensure your version is 2021.03.15. If it's not, see https://github.com/yt-dlp/yt-dlp on how to update. Issues with outdated version will be REJECTED.
+- First of, make sure you are using the latest version of yt-dlp. Run `yt-dlp --version` and ensure your version is 2021.03.21. If it's not, see https://github.com/yt-dlp/yt-dlp on how to update. Issues with outdated version will be REJECTED.
 - Search the bugtracker for similar feature requests: https://github.com/yt-dlp/yt-dlp. DO NOT post duplicates.
 - Finally, put x into all relevant boxes like this [x] (Dont forget to delete the empty space)
 -->
 
 - [ ] I'm reporting a feature request
-- [ ] I've verified that I'm running yt-dlp version **2021.03.15**
+- [ ] I've verified that I'm running yt-dlp version **2021.03.21**
 - [ ] I've searched the bugtracker for similar feature requests including closed ones
 
 
@@ -31,3 +31,7 @@ DennyDai
 codeasashu
 teesid
 kevinoconnor7
+damianoamatruda
+2ShedsJackson
+CXwudi
+xtkoba
12
Changelog.md
@@ -17,9 +17,19 @@
 -->
 
 
+### 2021.03.24
+* Merge youtube-dl: Upto [commit/8562218](https://github.com/ytdl-org/youtube-dl/commit/8562218350a79d4709da8593bb0c538aa0824acf)
+* Parse metadata from multiple fields using `--parse-metadata`
+* Ability to load playlist infojson using `--load-info-json`
+* Write current epoch to infojson when using `--no-clean-infojson`
+* [youtube_live_chat] fix bug when trying to set cookies
+* [niconico] Fix for when logged in by: @CXwudi and @xtkoba
+* [linuxacadamy] Fix login
+
+
 ### 2021.03.21
 * Merge youtube-dl: Upto [commit/7e79ba7](https://github.com/ytdl-org/youtube-dl/commit/7e79ba7dd6e6649dd2ce3a74004b2044f2182881)
-* Option `--clean-infojson` to keep private keys in the infojson
+* Option `--no-clean-infojson` to keep private keys in the infojson
 * [aria2c] Support retry/abort unavailable fragments by [damianoamatruda](https://github.com/damianoamatruda)
 * [aria2c] Better default arguments
 * [movefiles] Fix bugs and make more robust
34
README.md
@@ -3,7 +3,7 @@
 [](https://github.com/yt-dlp/yt-dlp/releases/latest)
 [](LICENSE)
 [](https://github.com/yt-dlp/yt-dlp/actions)
-[](https://discord.gg/S75JaBna)
+[](https://discord.gg/H5MNcFW63r)
 
 [](https://github.com/yt-dlp/yt-dlp/commits)
 [](https://github.com/yt-dlp/yt-dlp/commits)
@@ -58,7 +58,7 @@ The major new features from the latest release of [blackjack4494/yt-dlc](https:/
 
 * **[Format Sorting](#sorting-formats)**: The default format sorting options have been changed so that higher resolution and better codecs will be now preferred instead of simply using larger bitrate. Furthermore, you can now specify the sort order using `-S`. This allows for much easier format selection that what is possible by simply using `--format` ([examples](#format-selection-examples))
 
-* **Merged with youtube-dl v2021.03.14**: You get all the latest features and patches of [youtube-dl](https://github.com/ytdl-org/youtube-dl) in addition to all the features of [youtube-dlc](https://github.com/blackjack4494/yt-dlc)
+* **Merged with youtube-dl v2021.03.25**: You get all the latest features and patches of [youtube-dl](https://github.com/ytdl-org/youtube-dl) in addition to all the features of [youtube-dlc](https://github.com/blackjack4494/yt-dlc)
 
 * **Merged with animelover1984/youtube-dl**: You get most of the features and improvements from [animelover1984/youtube-dl](https://github.com/animelover1984/youtube-dl) including `--get-comments`, `BiliBiliSearch`, `BilibiliChannel`, Embedding thumbnail in mp4/ogg/opus, Playlist infojson etc. Note that the NicoNico improvements are not available. See [#31](https://github.com/yt-dlp/yt-dlp/pull/31) for details.
 
@@ -670,18 +670,24 @@ Then simply run `make`. You can also run `make yt-dlp` instead to compile only t
 --add-metadata Write metadata to the video file
 --no-add-metadata Do not write metadata (default)
 --parse-metadata FIELD:FORMAT Parse additional metadata like title/artist
-from other fields. Give field name to
-extract data from, and format of the field
-seperated by a ":". Either regular
-expression with named capture groups or a
-similar syntax to the output template can
-also be used. The parsed parameters replace
-any existing values and can be use in
-output template. This option can be used
-multiple times. Example: --parse-metadata
-"title:%(artist)s - %(title)s" matches a
-title like "Coldplay - Paradise". Example
-(regex): --parse-metadata
+from other fields. Give a template or field
+name to extract data from and the format to
+interpret it as, seperated by a ":". Either
+regular expression with named capture
+groups or a similar syntax to the output
+template can be used for the FORMAT.
+Similarly, the syntax for output template
+can be used for FIELD to parse the data
+from multiple fields. The parsed parameters
+replace any existing values and can be used
+in output templates. This option can be
+used multiple times. Example: --parse-
+metadata "title:%(artist)s - %(title)s"
+matches a title like "Coldplay - Paradise".
+Example: --parse-metadata "%(series)s
+%(episode_number)s:%(title)s" sets the
+title using series and episode number.
+Example (regex): --parse-metadata
 "description:Artist - (?P<artist>.+?)"
 --xattrs Write metadata to the video file's xattrs
 (using dublin core and xdg standards)
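The `FIELD:FORMAT` behaviour described in the `--parse-metadata` help text above can be illustrated with a minimal Python sketch. It assumes that a template-style FORMAT is converted into a regular expression with one named group per `%(name)s` placeholder. The helper names `format_to_regex` and `parse_metadata` are hypothetical and not taken from this diff, and yt-dlp's actual implementation may differ.

```python
import re


def format_to_regex(fmt):
    """Turn '%(artist)s - %(title)s' into a regex with named groups (illustrative)."""
    regex, last = '', 0
    for m in re.finditer(r'%\((\w+)\)s', fmt):
        regex += re.escape(fmt[last:m.start()])  # literal text between placeholders
        regex += r'(?P<%s>.+)' % m.group(1)      # one named group per placeholder
        last = m.end()
    return regex + re.escape(fmt[last:])


def parse_metadata(info, spec):
    """Apply a 'FIELD:FORMAT' spec and return an updated copy of info."""
    field, fmt = spec.split(':', 1)
    # FIELD may itself be a template such as '%(series)s %(episode_number)s'
    source = field % info if '%(' in field else info.get(field, '')
    # FORMAT is either a regex with named groups or an output-template-like string
    pattern = fmt if '(?P<' in fmt else format_to_regex(fmt)
    match = re.search(pattern, str(source))
    result = dict(info)
    if match:
        result.update(match.groupdict())  # parsed values replace existing ones
    return result


print(parse_metadata({'title': 'Coldplay - Paradise'},
                     'title:%(artist)s - %(title)s'))
print(parse_metadata({'series': 'Show', 'episode_number': 3, 'title': 'old'},
                     '%(series)s %(episode_number)s:%(title)s'))
```

The first call splits the title into `artist` and `title`; the second builds a new `title` from two fields, mirroring the two non-regex examples in the help text.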
@@ -97,7 +97,8 @@
 - **bbc**: BBC
 - **bbc.co.uk**: BBC iPlayer
 - **bbc.co.uk:article**: BBC articles
-- **bbc.co.uk:iplayer:playlist**
+- **bbc.co.uk:iplayer:episodes**
+- **bbc.co.uk:iplayer:group**
 - **bbc.co.uk:playlist**
 - **BBVTV**
 - **Beatport**
@@ -1251,5 +1252,6 @@
 - **zee5:series**
 - **Zhihu**
 - **zingmp3**: mp3.zing.vn
+- **zingmp3:album**
 - **zoom**
 - **Zype**
@@ -60,12 +60,14 @@ from .utils import (
     encode_compat_str,
     encodeFilename,
     error_to_compat_str,
+    EntryNotInPlaylist,
     ExistingVideoReached,
     expand_path,
     ExtractorError,
     float_or_none,
     format_bytes,
     format_field,
+    FORMAT_RE,
     formatSeconds,
     GeoRestrictedError,
     int_or_none,
@@ -771,95 +773,93 @@ class YoutubeDL(object):
|
||||
'Put from __future__ import unicode_literals at the top of your code file or consider switching to Python 3.x.')
|
||||
return outtmpl_dict
|
||||
|
||||
def prepare_outtmpl(self, outtmpl, info_dict, sanitize=None):
|
||||
""" Make the template and info_dict suitable for substitution (outtmpl % info_dict)"""
|
||||
template_dict = dict(info_dict)
|
||||
|
||||
# duration_string
|
||||
template_dict['duration_string'] = ( # %(duration>%H-%M-%S)s is wrong if duration > 24hrs
|
||||
formatSeconds(info_dict['duration'], '-')
|
||||
if info_dict.get('duration', None) is not None
|
||||
else None)
|
||||
|
||||
# epoch
|
||||
template_dict['epoch'] = int(time.time())
|
||||
|
||||
# autonumber
|
||||
autonumber_size = self.params.get('autonumber_size')
|
||||
if autonumber_size is None:
|
||||
autonumber_size = 5
|
||||
template_dict['autonumber'] = self.params.get('autonumber_start', 1) - 1 + self._num_downloads
|
||||
|
||||
# resolution if not defined
|
||||
if template_dict.get('resolution') is None:
|
||||
if template_dict.get('width') and template_dict.get('height'):
|
||||
template_dict['resolution'] = '%dx%d' % (template_dict['width'], template_dict['height'])
|
||||
elif template_dict.get('height'):
|
||||
template_dict['resolution'] = '%sp' % template_dict['height']
|
||||
elif template_dict.get('width'):
|
||||
template_dict['resolution'] = '%dx?' % template_dict['width']
|
||||
|
||||
if sanitize is None:
|
||||
sanitize = lambda k, v: v
|
||||
template_dict = dict((k, v if isinstance(v, compat_numeric_types) else sanitize(k, v))
|
||||
for k, v in template_dict.items()
|
||||
if v is not None and not isinstance(v, (list, tuple, dict)))
|
||||
na = self.params.get('outtmpl_na_placeholder', 'NA')
|
||||
template_dict = collections.defaultdict(lambda: na, template_dict)
|
||||
|
||||
# For fields playlist_index and autonumber convert all occurrences
|
||||
# of %(field)s to %(field)0Nd for backward compatibility
|
||||
field_size_compat_map = {
|
||||
'playlist_index': len(str(template_dict['n_entries'])),
|
||||
'autonumber': autonumber_size,
|
||||
}
|
||||
FIELD_SIZE_COMPAT_RE = r'(?<!%)%\((?P<field>autonumber|playlist_index)\)s'
|
||||
mobj = re.search(FIELD_SIZE_COMPAT_RE, outtmpl)
|
||||
if mobj:
|
||||
outtmpl = re.sub(
|
||||
FIELD_SIZE_COMPAT_RE,
|
||||
r'%%(\1)0%dd' % field_size_compat_map[mobj.group('field')],
|
||||
outtmpl)
|
||||
|
||||
numeric_fields = list(self._NUMERIC_FIELDS)
|
||||
|
||||
# Format date
|
||||
FORMAT_DATE_RE = FORMAT_RE.format(r'(?P<key>(?P<field>\w+)>(?P<format>.+?))')
|
||||
for mobj in re.finditer(FORMAT_DATE_RE, outtmpl):
|
||||
conv_type, field, frmt, key = mobj.group('type', 'field', 'format', 'key')
|
||||
if key in template_dict:
|
||||
continue
|
||||
value = strftime_or_none(template_dict.get(field), frmt, na)
|
||||
if conv_type in 'crs': # string
|
||||
value = sanitize(field, value)
|
||||
else: # number
|
||||
numeric_fields.append(key)
|
||||
value = float_or_none(value, default=None)
|
||||
if value is not None:
|
||||
template_dict[key] = value
|
||||
|
||||
# Missing numeric fields used together with integer presentation types
|
||||
# in format specification will break the argument substitution since
|
||||
# string NA placeholder is returned for missing fields. We will patch
|
||||
# output template for missing fields to meet string presentation type.
|
||||
for numeric_field in numeric_fields:
|
||||
if numeric_field not in template_dict:
|
||||
outtmpl = re.sub(
|
||||
FORMAT_RE.format(re.escape(numeric_field)),
|
||||
r'%({0})s'.format(numeric_field), outtmpl)
|
||||
|
||||
return outtmpl, template_dict
|
||||
|
||||
def _prepare_filename(self, info_dict, tmpl_type='default'):
|
||||
try:
|
||||
template_dict = dict(info_dict)
|
||||
|
||||
template_dict['duration_string'] = ( # %(duration>%H-%M-%S)s is wrong if duration > 24hrs
|
||||
formatSeconds(info_dict['duration'], '-')
|
||||
if info_dict.get('duration', None) is not None
|
||||
else None)
|
||||
|
||||
template_dict['epoch'] = int(time.time())
|
||||
autonumber_size = self.params.get('autonumber_size')
|
||||
if autonumber_size is None:
|
||||
autonumber_size = 5
|
||||
template_dict['autonumber'] = self.params.get('autonumber_start', 1) - 1 + self._num_downloads
|
||||
if template_dict.get('resolution') is None:
|
||||
if template_dict.get('width') and template_dict.get('height'):
|
||||
template_dict['resolution'] = '%dx%d' % (template_dict['width'], template_dict['height'])
|
||||
elif template_dict.get('height'):
|
||||
template_dict['resolution'] = '%sp' % template_dict['height']
|
||||
elif template_dict.get('width'):
|
||||
template_dict['resolution'] = '%dx?' % template_dict['width']
|
||||
|
||||
sanitize = lambda k, v: sanitize_filename(
|
||||
compat_str(v),
|
||||
restricted=self.params.get('restrictfilenames'),
|
||||
is_id=(k == 'id' or k.endswith('_id')))
|
||||
template_dict = dict((k, v if isinstance(v, compat_numeric_types) else sanitize(k, v))
|
||||
for k, v in template_dict.items()
|
||||
if v is not None and not isinstance(v, (list, tuple, dict)))
|
||||
na = self.params.get('outtmpl_na_placeholder', 'NA')
|
||||
template_dict = collections.defaultdict(lambda: na, template_dict)
|
||||
|
||||
outtmpl = self.outtmpl_dict.get(tmpl_type, self.outtmpl_dict['default'])
|
||||
force_ext = OUTTMPL_TYPES.get(tmpl_type)
|
||||
|
||||
# For fields playlist_index and autonumber convert all occurrences
|
||||
# of %(field)s to %(field)0Nd for backward compatibility
|
||||
field_size_compat_map = {
|
||||
'playlist_index': len(str(template_dict['n_entries'])),
|
||||
'autonumber': autonumber_size,
|
||||
}
|
||||
FIELD_SIZE_COMPAT_RE = r'(?<!%)%\((?P<field>autonumber|playlist_index)\)s'
|
||||
mobj = re.search(FIELD_SIZE_COMPAT_RE, outtmpl)
|
||||
if mobj:
|
||||
outtmpl = re.sub(
|
||||
FIELD_SIZE_COMPAT_RE,
|
||||
r'%%(\1)0%dd' % field_size_compat_map[mobj.group('field')],
|
||||
outtmpl)
|
||||
|
||||
# As of [1] format syntax is:
|
||||
# %[mapping_key][conversion_flags][minimum_width][.precision][length_modifier]type
|
||||
# 1. https://docs.python.org/2/library/stdtypes.html#string-formatting
|
||||
FORMAT_RE = r'''(?x)
|
||||
(?<!%)
|
||||
%
|
||||
\({0}\) # mapping key
|
||||
(?:[#0\-+ ]+)? # conversion flags (optional)
|
||||
(?:\d+)? # minimum field width (optional)
|
||||
(?:\.\d+)? # precision (optional)
|
||||
[hlL]? # length modifier (optional)
|
||||
(?P<type>[diouxXeEfFgGcrs%]) # conversion type
|
||||
'''
|
||||
|
||||
numeric_fields = list(self._NUMERIC_FIELDS)
|
||||
|
||||
# Format date
|
||||
FORMAT_DATE_RE = FORMAT_RE.format(r'(?P<key>(?P<field>\w+)>(?P<format>.+?))')
|
||||
for mobj in re.finditer(FORMAT_DATE_RE, outtmpl):
|
||||
conv_type, field, frmt, key = mobj.group('type', 'field', 'format', 'key')
|
||||
if key in template_dict:
|
||||
continue
|
||||
value = strftime_or_none(template_dict.get(field), frmt, na)
|
||||
if conv_type in 'crs': # string
|
||||
value = sanitize(field, value)
|
||||
else: # number
|
||||
numeric_fields.append(key)
|
||||
value = float_or_none(value, default=None)
|
||||
if value is not None:
|
||||
template_dict[key] = value
|
||||
|
||||
# Missing numeric fields used together with integer presentation types
|
||||
# in format specification will break the argument substitution since
|
||||
# string NA placeholder is returned for missing fields. We will patch
|
||||
# output template for missing fields to meet string presentation type.
|
||||
for numeric_field in numeric_fields:
|
||||
if numeric_field not in template_dict:
|
||||
outtmpl = re.sub(
|
||||
FORMAT_RE.format(re.escape(numeric_field)),
|
||||
r'%({0})s'.format(numeric_field), outtmpl)
|
||||
outtmpl, template_dict = self.prepare_outtmpl(outtmpl, info_dict, sanitize)
|
||||
|
||||
# expand_path translates '%%' into '%' and '$$' into '$'
|
||||
# correspondingly that is not what we want since we need to keep
|
||||
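The comment in the hunk above explains why missing numeric fields need special handling: a `%d`-style conversion fails when the value is the string placeholder `NA`. The following is a minimal standalone sketch of that patching step, reusing the `FORMAT_RE` pattern shown in the diff. The function name `render_with_na` and its signature are hypothetical and exist only for illustration.

```python
import collections
import re

# Same pattern as FORMAT_RE in the diff; {0} is filled with the field name.
FORMAT_RE = r'''(?x)
    (?<!%)
    %
    \({0}\)                       # mapping key
    (?:[#0\-+ ]+)?                # conversion flags (optional)
    (?:\d+)?                      # minimum field width (optional)
    (?:\.\d+)?                    # precision (optional)
    [hlL]?                        # length modifier (optional)
    (?P<type>[diouxXeEfFgGcrs%])  # conversion type
'''


def render_with_na(outtmpl, info, numeric_fields, na='NA'):
    """Render outtmpl, downgrading numeric conversions for missing fields to %s."""
    template_dict = collections.defaultdict(lambda: na, info)
    for field in numeric_fields:
        if field not in template_dict:
            # e.g. '%(view_count)05d' becomes '%(view_count)s'
            outtmpl = re.sub(
                FORMAT_RE.format(re.escape(field)),
                r'%({0})s'.format(field), outtmpl)
    return outtmpl % template_dict


tmpl = '%(title)s-%(view_count)05d.%(ext)s'
print(render_with_na(tmpl, {'title': 'a', 'ext': 'mp4'}, ['view_count']))
print(render_with_na(tmpl, {'title': 'a', 'ext': 'mp4', 'view_count': 42}, ['view_count']))
```

Without the substitution, the first call would raise a `TypeError`, because `'%05d'` cannot format the string `'NA'`.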
@@ -874,6 +874,7 @@ class YoutubeDL(object):
             # title "Hello $PATH", we don't want `$PATH` to be expanded.
             filename = expand_path(outtmpl).replace(sep, '') % template_dict
 
+            force_ext = OUTTMPL_TYPES.get(tmpl_type)
             if force_ext is not None:
                 filename = replace_extension(filename, force_ext, template_dict.get('ext'))
 
@@ -1180,48 +1181,16 @@ class YoutubeDL(object):
|
||||
playlist = ie_result.get('title') or ie_result.get('id')
|
||||
self.to_screen('[download] Downloading playlist: %s' % playlist)
|
||||
|
||||
if self.params.get('allow_playlist_files', True):
|
||||
ie_copy = {
|
||||
'playlist': playlist,
|
||||
'playlist_id': ie_result.get('id'),
|
||||
'playlist_title': ie_result.get('title'),
|
||||
'playlist_uploader': ie_result.get('uploader'),
|
||||
'playlist_uploader_id': ie_result.get('uploader_id'),
|
||||
'playlist_index': 0
|
||||
}
|
||||
ie_copy.update(dict(ie_result))
|
||||
|
||||
if self.params.get('writeinfojson', False):
|
||||
infofn = self.prepare_filename(ie_copy, 'pl_infojson')
|
||||
if not self._ensure_dir_exists(encodeFilename(infofn)):
|
||||
return
|
||||
if not self.params.get('overwrites', True) and os.path.exists(encodeFilename(infofn)):
|
||||
self.to_screen('[info] Playlist metadata is already present')
|
||||
else:
|
||||
playlist_info = dict(ie_result)
|
||||
# playlist_info['entries'] = list(playlist_info['entries']) # Entries is a generator which shouldnot be resolved here
|
||||
self.to_screen('[info] Writing playlist metadata as JSON to: ' + infofn)
|
||||
try:
|
||||
write_json_file(self.filter_requested_info(playlist_info, self.params.get('clean_infojson', True)), infofn)
|
||||
except (OSError, IOError):
|
||||
self.report_error('Cannot write playlist metadata to JSON file ' + infofn)
|
||||
|
||||
if self.params.get('writedescription', False):
|
||||
descfn = self.prepare_filename(ie_copy, 'pl_description')
|
||||
if not self._ensure_dir_exists(encodeFilename(descfn)):
|
||||
return
|
||||
if not self.params.get('overwrites', True) and os.path.exists(encodeFilename(descfn)):
|
||||
self.to_screen('[info] Playlist description is already present')
|
||||
elif ie_result.get('description') is None:
|
||||
self.report_warning('There\'s no playlist description to write.')
|
||||
else:
|
||||
try:
|
||||
self.to_screen('[info] Writing playlist description to: ' + descfn)
|
||||
with io.open(encodeFilename(descfn), 'w', encoding='utf-8') as descfile:
|
||||
descfile.write(ie_result['description'])
|
||||
except (OSError, IOError):
|
||||
self.report_error('Cannot write playlist description file ' + descfn)
|
||||
return
|
||||
if 'entries' not in ie_result:
|
||||
raise EntryNotInPlaylist()
|
||||
incomplete_entries = bool(ie_result.get('requested_entries'))
|
||||
if incomplete_entries:
|
||||
def fill_missing_entries(entries, indexes):
|
||||
ret = [None] * max(*indexes)
|
||||
for i, entry in zip(indexes, entries):
|
||||
ret[i - 1] = entry
|
||||
return ret
|
||||
ie_result['entries'] = fill_missing_entries(ie_result['entries'], ie_result['requested_entries'])
|
||||
|
||||
playlist_results = []
|
||||
|
||||
@@ -1248,25 +1217,20 @@ class YoutubeDL(object):
|
||||
|
||||
def make_playlistitems_entries(list_ie_entries):
|
||||
num_entries = len(list_ie_entries)
|
||||
return [
|
||||
list_ie_entries[i - 1] for i in playlistitems
|
||||
if -num_entries <= i - 1 < num_entries]
|
||||
|
||||
def report_download(num_entries):
|
||||
self.to_screen(
|
||||
'[%s] playlist %s: Downloading %d videos' %
|
||||
(ie_result['extractor'], playlist, num_entries))
|
||||
for i in playlistitems:
|
||||
if -num_entries < i <= num_entries:
|
||||
yield list_ie_entries[i - 1]
|
||||
elif incomplete_entries:
|
||||
raise EntryNotInPlaylist()
|
||||
|
||||
if isinstance(ie_entries, list):
|
||||
n_all_entries = len(ie_entries)
|
||||
if playlistitems:
|
||||
entries = make_playlistitems_entries(ie_entries)
|
||||
entries = list(make_playlistitems_entries(ie_entries))
|
||||
else:
|
||||
entries = ie_entries[playliststart:playlistend]
|
||||
n_entries = len(entries)
|
||||
self.to_screen(
|
||||
'[%s] playlist %s: Collected %d video ids (downloading %d of them)' %
|
||||
(ie_result['extractor'], playlist, n_all_entries, n_entries))
|
||||
msg = 'Collected %d videos; downloading %d of them' % (n_all_entries, n_entries)
|
||||
elif isinstance(ie_entries, PagedList):
|
||||
if playlistitems:
|
||||
entries = []
|
||||
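The two helpers in the hunks above work together when a playlist is reloaded from an infojson that only holds some of its entries: the partial list is padded back to original playlist positions, and requested items are then picked by index. A minimal standalone sketch follows. The `EntryNotInPlaylist` class here is a stand-in for the exception imported in the diff, `select_items` is a hypothetical name for the generator, and `max(indexes)` is used where the diff has `max(*indexes)`, since the starred form raises `TypeError` when only one index is given.

```python
class EntryNotInPlaylist(Exception):
    """Stand-in for the exception of the same name imported in the diff."""


def fill_missing_entries(entries, indexes):
    """Place each entry at its 1-based playlist index, padding gaps with None."""
    ret = [None] * max(indexes)
    for i, entry in zip(indexes, entries):
        ret[i - 1] = entry
    return ret


def select_items(all_entries, playlistitems, incomplete_entries):
    """Yield the requested 1-based items; fail if an incomplete list lacks one."""
    num_entries = len(all_entries)
    for i in playlistitems:
        if -num_entries < i <= num_entries:
            yield all_entries[i - 1]
        elif incomplete_entries:
            raise EntryNotInPlaylist()


padded = fill_missing_entries(['first', 'third'], [1, 3])
print(padded)
print(list(select_items(padded, [1, 3], incomplete_entries=True)))
```

An item that maps to a `None` slot is still yielded here; in the diff, a later check raises `EntryNotInPlaylist` when any selected entry is `None`.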
@@ -1278,25 +1242,73 @@ class YoutubeDL(object):
|
||||
entries = ie_entries.getslice(
|
||||
playliststart, playlistend)
|
||||
n_entries = len(entries)
|
||||
report_download(n_entries)
|
||||
msg = 'Downloading %d videos' % n_entries
|
||||
else: # iterable
|
||||
if playlistitems:
|
||||
entries = make_playlistitems_entries(list(itertools.islice(
|
||||
ie_entries, 0, max(playlistitems))))
|
||||
entries = list(make_playlistitems_entries(list(itertools.islice(
|
||||
ie_entries, 0, max(playlistitems)))))
|
||||
else:
|
||||
entries = list(itertools.islice(
|
||||
ie_entries, playliststart, playlistend))
|
||||
n_entries = len(entries)
|
||||
report_download(n_entries)
|
||||
msg = 'Downloading %d videos' % n_entries
|
||||
|
||||
if any((entry is None for entry in entries)):
|
||||
raise EntryNotInPlaylist()
|
||||
if not playlistitems and (playliststart or playlistend):
|
||||
playlistitems = list(range(1 + playliststart, 1 + playliststart + len(entries)))
|
||||
ie_result['entries'] = entries
|
||||
ie_result['requested_entries'] = playlistitems
|
||||
|
||||
if self.params.get('allow_playlist_files', True):
|
||||
ie_copy = {
|
||||
'playlist': playlist,
|
||||
'playlist_id': ie_result.get('id'),
|
||||
'playlist_title': ie_result.get('title'),
|
||||
'playlist_uploader': ie_result.get('uploader'),
|
||||
'playlist_uploader_id': ie_result.get('uploader_id'),
|
||||
'playlist_index': 0
|
||||
}
|
||||
ie_copy.update(dict(ie_result))
|
||||
|
||||
if self.params.get('writeinfojson', False):
|
||||
infofn = self.prepare_filename(ie_copy, 'pl_infojson')
|
||||
if not self._ensure_dir_exists(encodeFilename(infofn)):
|
||||
return
|
||||
if not self.params.get('overwrites', True) and os.path.exists(encodeFilename(infofn)):
|
||||
self.to_screen('[info] Playlist metadata is already present')
|
||||
else:
|
||||
self.to_screen('[info] Writing playlist metadata as JSON to: ' + infofn)
|
||||
try:
|
||||
write_json_file(self.filter_requested_info(ie_result, self.params.get('clean_infojson', True)), infofn)
|
||||
except (OSError, IOError):
|
||||
self.report_error('Cannot write playlist metadata to JSON file ' + infofn)
|
||||
|
||||
if self.params.get('writedescription', False):
|
||||
descfn = self.prepare_filename(ie_copy, 'pl_description')
|
||||
if not self._ensure_dir_exists(encodeFilename(descfn)):
|
||||
return
|
||||
if not self.params.get('overwrites', True) and os.path.exists(encodeFilename(descfn)):
|
||||
self.to_screen('[info] Playlist description is already present')
|
||||
elif ie_result.get('description') is None:
|
||||
self.report_warning('There\'s no playlist description to write.')
|
||||
else:
|
||||
try:
|
||||
self.to_screen('[info] Writing playlist description to: ' + descfn)
|
||||
with io.open(encodeFilename(descfn), 'w', encoding='utf-8') as descfile:
|
||||
descfile.write(ie_result['description'])
|
||||
except (OSError, IOError):
|
||||
self.report_error('Cannot write playlist description file ' + descfn)
|
||||
return
|
||||
|
||||
if self.params.get('playlistreverse', False):
|
||||
entries = entries[::-1]
|
||||
|
||||
if self.params.get('playlistrandom', False):
|
||||
random.shuffle(entries)
|
||||
|
||||
x_forwarded_for = ie_result.get('__x_forwarded_for_ip')
|
||||
|
||||
self.to_screen('[%s] playlist %s: %s' % (ie_result['extractor'], playlist, msg))
|
||||
for i, entry in enumerate(entries, 1):
|
||||
self.to_screen('[download] Downloading video %s of %s' % (i, n_entries))
|
||||
# This __x_forwarded_for_ip thing is a bit ugly but requires
|
||||
@@ -1310,7 +1322,7 @@ class YoutubeDL(object):
                 'playlist_title': ie_result.get('title'),
                 'playlist_uploader': ie_result.get('uploader'),
                 'playlist_uploader_id': ie_result.get('uploader_id'),
-                'playlist_index': playlistitems[i - 1] if playlistitems else i + playliststart,
+                'playlist_index': playlistitems[i - 1] if playlistitems else i,
                 'extractor': ie_result['extractor'],
                 'webpage_url': ie_result['webpage_url'],
                 'webpage_url_basename': url_basename(ie_result['webpage_url']),
@@ -2524,10 +2536,10 @@ class YoutubeDL(object):
                 [info_filename], mode='r',
                 openhook=fileinput.hook_encoded('utf-8'))) as f:
             # FileInput doesn't have a read method, we can't call json.load
-            info = self.filter_requested_info(json.loads('\n'.join(f)))
+            info = self.filter_requested_info(json.loads('\n'.join(f)), self.params.get('clean_infojson', True))
         try:
             self.process_ie_result(info, download=True)
-        except DownloadError:
+        except (DownloadError, EntryNotInPlaylist):
             webpage_url = info.get('webpage_url')
             if webpage_url is not None:
                 self.report_warning('The info failed to download, trying with "%s"' % webpage_url)
@@ -2539,9 +2551,10 @@ class YoutubeDL(object):
|
||||
@staticmethod
|
||||
def filter_requested_info(info_dict, actually_filter=True):
|
||||
if not actually_filter:
|
||||
info_dict['epoch'] = int(time.time())
|
||||
return info_dict
|
||||
exceptions = {
|
||||
'remove': ['requested_formats', 'requested_subtitles', 'filepath', 'entries'],
|
||||
'remove': ['requested_formats', 'requested_subtitles', 'requested_entries', 'filepath', 'entries'],
|
||||
'keep': ['_type'],
|
||||
}
|
||||
keep_key = lambda k: k in exceptions['keep'] or not (k.startswith('_') or k in exceptions['remove'])
|
||||
|
||||
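The `filter_requested_info` hunk above can be tried in isolation with the sketch below. The lines up to `keep_key` are copied from the hunk; the final dict comprehension is an assumption, because the diff context stops before the function's return, and the sample `info` dict is invented for illustration.

```python
import time


def filter_requested_info(info_dict, actually_filter=True):
    # With filtering disabled, only stamp the dict and hand it back unchanged.
    if not actually_filter:
        info_dict['epoch'] = int(time.time())
        return info_dict
    exceptions = {
        'remove': ['requested_formats', 'requested_subtitles', 'requested_entries', 'filepath', 'entries'],
        'keep': ['_type'],
    }
    keep_key = lambda k: k in exceptions['keep'] or not (k.startswith('_') or k in exceptions['remove'])
    # Assumption: this return is not visible in the hunk.
    return {k: v for k, v in info_dict.items() if keep_key(k)}


info = {  # invented sample data
    '_type': 'video',
    '_filename': 'clip.mp4',
    'id': 'abc',
    'requested_entries': [1, 2],
    'filepath': '/tmp/clip.mp4',
}
filtered = filter_requested_info(dict(info))
unfiltered = filter_requested_info(dict(info), actually_filter=False)
```

Under these assumptions, `filtered` keeps only `_type` and `id`, while `unfiltered` keeps every key and gains an `epoch` timestamp, which is what passing `clean_infojson=False` through `--load-info-json` appears to rely on.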
@@ -79,8 +79,7 @@ class YoutubeLiveChatReplayFD(FragmentFD):
|
||||
|
||||
self._prepare_and_start_frag_download(ctx)
|
||||
|
||||
success, raw_fragment = dl_fragment(
|
||||
'https://www.youtube.com/watch?v={}'.format(video_id))
|
||||
success, raw_fragment = dl_fragment(info_dict['url'])
|
||||
if not success:
|
||||
return False
|
||||
try:
|
||||
|
||||
@@ -413,6 +413,12 @@ class ARDBetaMediathekIE(ARDMediathekBaseIE):
|
||||
# playlist of type 'sammlung'
|
||||
'url': 'https://www.ardmediathek.de/ard/sammlung/team-muenster/5JpTzLSbWUAK8184IOvEir/',
|
||||
'only_matching': True,
|
||||
}, {
|
||||
'url': 'https://www.ardmediathek.de/video/coronavirus-update-ndr-info/astrazeneca-kurz-lockdown-und-pims-syndrom-81/ndr/Y3JpZDovL25kci5kZS84NzE0M2FjNi0wMWEwLTQ5ODEtOTE5NS1mOGZhNzdhOTFmOTI/',
|
||||
'only_matching': True,
|
||||
}, {
|
||||
'url': 'https://www.ardmediathek.de/ard/player/Y3JpZDovL3dkci5kZS9CZWl0cmFnLWQ2NDJjYWEzLTMwZWYtNGI4NS1iMTI2LTU1N2UxYTcxOGIzOQ/tatort-duo-koeln-leipzig-ihr-kinderlein-kommet',
|
||||
'only_matching': True,
|
||||
}]
|
||||
|
||||
def _ARD_load_playlist_snipped(self, playlist_id, display_id, client, mode, pageNumber):
|
||||
@@ -512,13 +518,7 @@ class ARDBetaMediathekIE(ARDMediathekBaseIE):
|
||||
return self.playlist_result(entries, playlist_title=display_id)
|
||||
|
||||
def _real_extract(self, url):
|
||||
mobj = re.match(self._VALID_URL, url)
|
||||
video_id = mobj.group('video_id')
|
||||
display_id = mobj.group('display_id')
|
||||
if display_id:
|
||||
display_id = display_id.rstrip('/')
|
||||
if not display_id:
|
||||
display_id = video_id
|
||||
video_id = self._match_id(url)
|
||||
|
||||
if mobj.group('mode') in ('sendung', 'sammlung'):
|
||||
# this is a playlist-URL
|
||||
@@ -529,9 +529,9 @@ class ARDBetaMediathekIE(ARDMediathekBaseIE):
|
||||
|
||||
player_page = self._download_json(
|
||||
'https://api.ardmediathek.de/public-gateway',
|
||||
display_id, data=json.dumps({
|
||||
video_id, data=json.dumps({
|
||||
'query': '''{
|
||||
playerPage(client:"%s", clipId: "%s") {
|
||||
playerPage(client: "ard", clipId: "%s") {
|
||||
blockedByFsk
|
||||
broadcastedOn
|
||||
maturityContentRating
|
||||
@@ -561,7 +561,7 @@ class ARDBetaMediathekIE(ARDMediathekBaseIE):
|
||||
}
|
||||
}
|
||||
}
|
||||
}''' % (mobj.group('client'), video_id),
|
||||
}''' % video_id,
|
||||
}).encode(), headers={
|
||||
'Content-Type': 'application/json'
|
||||
})['data']['playerPage']
|
||||
@@ -586,7 +586,6 @@ class ARDBetaMediathekIE(ARDMediathekBaseIE):
|
||||
r'\(FSK\s*(\d+)\)\s*$', description, 'age limit', default=None))
|
||||
info.update({
|
||||
'age_limit': age_limit,
|
||||
'display_id': display_id,
|
||||
'title': title,
|
||||
'description': description,
|
||||
'timestamp': unified_timestamp(player_page.get('broadcastedOn')),
|
||||
|
||||
@@ -1,17 +1,22 @@
|
||||
# coding: utf-8
|
||||
from __future__ import unicode_literals
|
||||
|
||||
import functools
|
||||
import itertools
|
||||
import json
|
||||
import re
|
||||
|
||||
from .common import InfoExtractor
|
||||
from ..compat import (
|
||||
compat_etree_Element,
|
||||
compat_HTTPError,
|
||||
compat_parse_qs,
|
||||
compat_urllib_parse_urlparse,
|
||||
compat_urlparse,
|
||||
)
|
||||
from ..utils import (
|
||||
ExtractorError,
|
||||
OnDemandPagedList,
|
||||
clean_html,
|
||||
dict_get,
|
||||
float_or_none,
|
||||
@@ -811,7 +816,7 @@ class BBCIE(BBCCoUkIE):
|
||||
|
||||
@classmethod
|
||||
def suitable(cls, url):
|
||||
EXCLUDE_IE = (BBCCoUkIE, BBCCoUkArticleIE, BBCCoUkIPlayerPlaylistIE, BBCCoUkPlaylistIE)
|
||||
EXCLUDE_IE = (BBCCoUkIE, BBCCoUkArticleIE, BBCCoUkIPlayerEpisodesIE, BBCCoUkIPlayerGroupIE, BBCCoUkPlaylistIE)
|
||||
return (False if any(ie.suitable(url) for ie in EXCLUDE_IE)
|
||||
else super(BBCIE, cls).suitable(url))
|
||||
|
||||
@@ -1338,21 +1343,149 @@ class BBCCoUkPlaylistBaseIE(InfoExtractor):
|
||||
playlist_id, title, description)
|
||||
|
||||
|
||||
class BBCCoUkIPlayerPlaylistIE(BBCCoUkPlaylistBaseIE):
|
||||
IE_NAME = 'bbc.co.uk:iplayer:playlist'
|
||||
_VALID_URL = r'https?://(?:www\.)?bbc\.co\.uk/iplayer/(?:episodes|group)/(?P<id>%s)' % BBCCoUkIE._ID_REGEX
|
||||
_URL_TEMPLATE = 'http://www.bbc.co.uk/iplayer/episode/%s'
|
||||
_VIDEO_ID_TEMPLATE = r'data-ip-id=["\'](%s)'
|
||||
class BBCCoUkIPlayerPlaylistBaseIE(InfoExtractor):
|
||||
_VALID_URL_TMPL = r'https?://(?:www\.)?bbc\.co\.uk/iplayer/%%s/(?P<id>%s)' % BBCCoUkIE._ID_REGEX
|
||||
|
||||
@staticmethod
|
||||
def _get_default(episode, key, default_key='default'):
|
||||
return try_get(episode, lambda x: x[key][default_key])
|
||||
|
||||
def _get_description(self, data):
|
||||
synopsis = data.get(self._DESCRIPTION_KEY) or {}
|
||||
return dict_get(synopsis, ('large', 'medium', 'small'))
|
||||
|
||||
def _fetch_page(self, programme_id, per_page, series_id, page):
|
||||
elements = self._get_elements(self._call_api(
|
||||
programme_id, per_page, page + 1, series_id))
|
||||
for element in elements:
|
||||
episode = self._get_episode(element)
|
||||
episode_id = episode.get('id')
|
||||
if not episode_id:
|
||||
continue
|
||||
thumbnail = None
|
||||
image = self._get_episode_image(episode)
|
||||
if image:
|
||||
thumbnail = image.replace('{recipe}', 'raw')
|
||||
category = self._get_default(episode, 'labels', 'category')
|
||||
yield {
|
||||
'_type': 'url',
|
||||
'id': episode_id,
|
||||
'title': self._get_episode_field(episode, 'subtitle'),
|
||||
'url': 'https://www.bbc.co.uk/iplayer/episode/' + episode_id,
|
||||
'thumbnail': thumbnail,
|
||||
'description': self._get_description(episode),
|
||||
'categories': [category] if category else None,
|
||||
'series': self._get_episode_field(episode, 'title'),
|
||||
'ie_key': BBCCoUkIE.ie_key(),
|
||||
}
|
||||
|
||||
def _real_extract(self, url):
|
||||
pid = self._match_id(url)
|
||||
qs = compat_parse_qs(compat_urllib_parse_urlparse(url).query)
|
||||
series_id = qs.get('seriesId', [None])[0]
|
||||
page = qs.get('page', [None])[0]
|
||||
per_page = 36 if page else self._PAGE_SIZE
|
||||
fetch_page = functools.partial(self._fetch_page, pid, per_page, series_id)
|
||||
entries = fetch_page(int(page) - 1) if page else OnDemandPagedList(fetch_page, self._PAGE_SIZE)
|
||||
playlist_data = self._get_playlist_data(self._call_api(pid, 1))
|
||||
return self.playlist_result(
|
||||
entries, pid, self._get_playlist_title(playlist_data),
|
||||
self._get_description(playlist_data))
|
||||
|
||||
|
||||
class BBCCoUkIPlayerEpisodesIE(BBCCoUkIPlayerPlaylistBaseIE):
|
||||
IE_NAME = 'bbc.co.uk:iplayer:episodes'
|
||||
_VALID_URL = BBCCoUkIPlayerPlaylistBaseIE._VALID_URL_TMPL % 'episodes'
|
||||
_TESTS = [{
|
||||
'url': 'http://www.bbc.co.uk/iplayer/episodes/b05rcz9v',
|
||||
'info_dict': {
|
||||
'id': 'b05rcz9v',
|
||||
'title': 'The Disappearance',
|
||||
'description': 'French thriller serial about a missing teenager.',
|
||||
'description': 'md5:58eb101aee3116bad4da05f91179c0cb',
|
||||
},
|
||||
'playlist_mincount': 6,
|
||||
'skip': 'This programme is not currently available on BBC iPlayer',
|
||||
'playlist_mincount': 8,
|
||||
}, {
|
||||
# all seasons
|
||||
'url': 'https://www.bbc.co.uk/iplayer/episodes/b094m5t9/doctor-foster',
|
||||
'info_dict': {
|
||||
'id': 'b094m5t9',
|
||||
'title': 'Doctor Foster',
|
||||
'description': 'md5:5aa9195fad900e8e14b52acd765a9fd6',
|
||||
},
|
||||
'playlist_mincount': 10,
|
||||
}, {
|
||||
# explicit season
|
||||
'url': 'https://www.bbc.co.uk/iplayer/episodes/b094m5t9/doctor-foster?seriesId=b094m6nv',
|
||||
'info_dict': {
|
||||
'id': 'b094m5t9',
|
||||
'title': 'Doctor Foster',
|
||||
'description': 'md5:5aa9195fad900e8e14b52acd765a9fd6',
|
||||
},
|
||||
'playlist_mincount': 5,
|
||||
}, {
|
||||
# all pages
|
||||
'url': 'https://www.bbc.co.uk/iplayer/episodes/m0004c4v/beechgrove',
|
||||
'info_dict': {
|
||||
'id': 'm0004c4v',
|
||||
'title': 'Beechgrove',
|
||||
'description': 'Gardening show that celebrates Scottish horticulture and growing conditions.',
|
||||
},
|
||||
'playlist_mincount': 37,
|
||||
}, {
|
||||
# explicit page
|
||||
'url': 'https://www.bbc.co.uk/iplayer/episodes/m0004c4v/beechgrove?page=2',
|
||||
'info_dict': {
|
||||
'id': 'm0004c4v',
|
||||
'title': 'Beechgrove',
|
||||
'description': 'Gardening show that celebrates Scottish horticulture and growing conditions.',
|
||||
},
|
||||
'playlist_mincount': 1,
|
||||
}]
|
||||
_PAGE_SIZE = 100
|
||||
_DESCRIPTION_KEY = 'synopsis'
|
||||
|
||||
def _get_episode_image(self, episode):
|
||||
return self._get_default(episode, 'image')
|
||||
|
||||
def _get_episode_field(self, episode, field):
|
||||
return self._get_default(episode, field)
|
||||
|
||||
@staticmethod
|
||||
def _get_elements(data):
|
||||
return data['entities']['results']
|
||||
|
||||
@staticmethod
|
||||
def _get_episode(element):
|
||||
return element.get('episode') or {}
|
||||
|
||||
def _call_api(self, pid, per_page, page=1, series_id=None):
|
||||
variables = {
|
||||
'id': pid,
|
||||
'page': page,
|
||||
'perPage': per_page,
|
||||
}
|
||||
if series_id:
|
||||
variables['sliceId'] = series_id
|
||||
return self._download_json(
|
||||
'https://graph.ibl.api.bbc.co.uk/', pid, headers={
|
||||
'Content-Type': 'application/json'
|
||||
}, data=json.dumps({
|
||||
'id': '5692d93d5aac8d796a0305e895e61551',
|
||||
'variables': variables,
|
||||
}).encode('utf-8'))['data']['programme']
|
||||
|
||||
@staticmethod
|
||||
def _get_playlist_data(data):
|
||||
return data
|
||||
|
||||
def _get_playlist_title(self, data):
|
||||
return self._get_default(data, 'title')
|
||||
|
||||
|
||||
class BBCCoUkIPlayerGroupIE(BBCCoUkIPlayerPlaylistBaseIE):
|
||||
IE_NAME = 'bbc.co.uk:iplayer:group'
|
||||
_VALID_URL = BBCCoUkIPlayerPlaylistBaseIE._VALID_URL_TMPL % 'group'
|
||||
_TESTS = [{
|
||||
# Available for over a year unlike 30 days for most other programmes
|
||||
'url': 'http://www.bbc.co.uk/iplayer/group/p02tcc32',
|
||||
'info_dict': {
|
||||
@@ -1361,14 +1494,56 @@ class BBCCoUkIPlayerPlaylistIE(BBCCoUkPlaylistBaseIE):
|
||||
'description': 'md5:683e901041b2fe9ba596f2ab04c4dbe7',
|
||||
},
|
||||
'playlist_mincount': 10,
|
||||
}, {
|
||||
# all pages
|
||||
'url': 'https://www.bbc.co.uk/iplayer/group/p081d7j7',
|
||||
'info_dict': {
|
||||
'id': 'p081d7j7',
|
||||
'title': 'Music in Scotland',
|
||||
'description': 'Perfomances in Scotland and programmes featuring Scottish acts.',
|
||||
},
|
||||
'playlist_mincount': 47,
|
||||
}, {
|
||||
# explicit page
|
||||
'url': 'https://www.bbc.co.uk/iplayer/group/p081d7j7?page=2',
|
||||
'info_dict': {
|
||||
'id': 'p081d7j7',
|
||||
'title': 'Music in Scotland',
|
||||
'description': 'Perfomances in Scotland and programmes featuring Scottish acts.',
|
||||
},
|
||||
'playlist_mincount': 11,
|
||||
}]
|
||||
_PAGE_SIZE = 200
|
||||
_DESCRIPTION_KEY = 'synopses'
|
||||
|
||||
def _extract_title_and_description(self, webpage):
|
||||
title = self._search_regex(r'<h1>([^<]+)</h1>', webpage, 'title', fatal=False)
|
||||
description = self._search_regex(
|
||||
r'<p[^>]+class=(["\'])subtitle\1[^>]*>(?P<value>[^<]+)</p>',
|
||||
webpage, 'description', fatal=False, group='value')
|
||||
return title, description
|
||||
def _get_episode_image(self, episode):
|
||||
return self._get_default(episode, 'images', 'standard')
|
||||
|
||||
def _get_episode_field(self, episode, field):
|
||||
return episode.get(field)
|
||||
|
||||
@staticmethod
|
||||
def _get_elements(data):
|
||||
return data['elements']
|
||||
|
||||
@staticmethod
|
||||
def _get_episode(element):
|
||||
return element
|
||||
|
||||
def _call_api(self, pid, per_page, page=1, series_id=None):
|
||||
return self._download_json(
|
||||
'http://ibl.api.bbc.co.uk/ibl/v1/groups/%s/episodes' % pid,
|
||||
pid, query={
|
||||
'page': page,
|
||||
'per_page': per_page,
|
||||
})['group_episodes']
|
||||
|
||||
@staticmethod
|
||||
def _get_playlist_data(data):
|
||||
return data['group']
|
||||
|
||||
def _get_playlist_title(self, data):
|
||||
return data.get('title')
|
||||
|
||||
|
||||
class BBCCoUkPlaylistIE(BBCCoUkPlaylistBaseIE):
|
||||
|
||||
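The paging logic in `BBCCoUkIPlayerPlaylistBaseIE._real_extract` binds the fixed arguments with `functools.partial` and then either fetches one explicit page or walks all pages lazily. The sketch below imitates that shape without any network access. `call_api`, `all_pages` and `FAKE_EPISODES` are hypothetical stand-ins, and `all_pages` is a simplified replacement for `OnDemandPagedList`, not its real implementation. The page size is also kept constant here, whereas the extractor uses 36 for explicit pages.

```python
import functools
import itertools

FAKE_EPISODES = ['ep%02d' % i for i in range(1, 8)]  # invented data, 7 items


def call_api(programme_id, per_page, page):
    # Hypothetical stand-in for the extractor's _call_api; pages are 1-based.
    start = (page - 1) * per_page
    return FAKE_EPISODES[start:start + per_page]


def fetch_page(programme_id, per_page, page):
    # Mirrors _fetch_page: the caller passes a 0-based page index.
    for episode_id in call_api(programme_id, per_page, page + 1):
        yield 'https://www.bbc.co.uk/iplayer/episode/' + episode_id


def all_pages(fetch, per_page):
    # Simplified lazy pager: stop after the first page shorter than per_page.
    for page in itertools.count():
        items = list(fetch(page))
        for item in items:
            yield item
        if len(items) < per_page:
            break


fetch = functools.partial(fetch_page, 'b05rcz9v', 3)
explicit_page = list(fetch(2 - 1))      # comparable to a URL with ?page=2
everything = list(all_pages(fetch, 3))  # comparable to a URL without a page parameter
```

Binding `programme_id` and `per_page` up front leaves a one-argument callable, which is the interface a pager needs.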
@@ -108,7 +108,8 @@ from .bandcamp import BandcampIE, BandcampAlbumIE, BandcampWeeklyIE
|
||||
from .bbc import (
|
||||
BBCCoUkIE,
|
||||
BBCCoUkArticleIE,
|
||||
BBCCoUkIPlayerPlaylistIE,
|
||||
BBCCoUkIPlayerEpisodesIE,
|
||||
BBCCoUkIPlayerGroupIE,
|
||||
BBCCoUkPlaylistIE,
|
||||
BBCIE,
|
||||
)
|
||||
@@ -1673,9 +1674,14 @@ from .zattoo import (
|
||||
ZattooLiveIE,
|
||||
)
|
||||
from .zdf import ZDFIE, ZDFChannelIE
|
||||
from .zee5 import (
|
||||
Zee5IE,
|
||||
Zee5SeriesIE,
|
||||
)
|
||||
from .zhihu import ZhihuIE
|
||||
from .zingmp3 import ZingMp3IE
|
||||
from .zee5 import Zee5IE
|
||||
from .zee5 import Zee5SeriesIE
|
||||
from .zingmp3 import (
|
||||
ZingMp3IE,
|
||||
ZingMp3AlbumIE,
|
||||
)
|
||||
from .zoom import ZoomIE
|
||||
from .zype import ZypeIE
|
||||
|
||||
@@ -2965,7 +2965,7 @@ class GenericIE(InfoExtractor):
|
||||
webpage)
|
||||
if not mobj:
|
||||
mobj = re.search(
|
||||
r'data-video-link=["\'](?P<url>http://m.mlb.com/video/[^"\']+)',
|
||||
r'data-video-link=["\'](?P<url>http://m\.mlb\.com/video/[^"\']+)',
|
||||
webpage)
|
||||
if mobj is not None:
|
||||
return self.url_result(mobj.group('url'), 'MLB')
|
||||
|
||||
@@ -112,7 +112,7 @@ class LinuxAcademyIE(InfoExtractor):
|
||||
'client_id': self._CLIENT_ID,
|
||||
'redirect_uri': self._ORIGIN_URL,
|
||||
'tenant': 'lacausers',
|
||||
'connection': 'Username-Password-Authentication',
|
||||
'connection': 'Username-Password-ACG-Proxy',
|
||||
'username': username,
|
||||
'password': password,
|
||||
'sso': 'true',
|
||||
|
||||
@@ -340,7 +340,7 @@ class MTVServicesEmbeddedIE(MTVServicesInfoExtractor):
|
||||
@staticmethod
|
||||
def _extract_url(webpage):
|
||||
mobj = re.search(
|
||||
r'<iframe[^>]+?src=(["\'])(?P<url>(?:https?:)?//media.mtvnservices.com/embed/.+?)\1', webpage)
|
||||
r'<iframe[^>]+?src=(["\'])(?P<url>(?:https?:)?//media\.mtvnservices\.com/embed/.+?)\1', webpage)
|
||||
if mobj:
|
||||
return mobj.group('url')
|
||||
|
||||
|
||||
@@ -492,13 +492,12 @@ class NiconicoIE(InfoExtractor):
|
||||
self._sort_formats(formats)
|
||||
|
||||
# Start extracting information
|
||||
title = get_video_info_web('originalTitle')
|
||||
if not title:
|
||||
title = self._og_search_title(webpage, default=None)
|
||||
if not title:
|
||||
title = self._html_search_regex(
|
||||
title = (
|
||||
get_video_info_web(['originalTitle', 'title'])
|
||||
or self._og_search_title(webpage, default=None)
|
||||
or self._html_search_regex(
|
||||
r'<span[^>]+class="videoHeaderTitle"[^>]*>([^<]+)</span>',
|
||||
webpage, 'video title')
|
||||
webpage, 'video title'))
|
||||
|
||||
watch_api_data_string = self._html_search_regex(
|
||||
r'<div[^>]+id="watchAPIDataContainer"[^>]+>([^<]+)</div>',
|
||||
|
||||
@@ -143,7 +143,10 @@ class TikTokIE(TikTokBaseIE):
|
||||
props_data = try_get(json_data, lambda x: x['props'], expected_type=dict)
|
||||
|
||||
# Check statusCode for success
|
||||
if props_data.get('pageProps').get('statusCode') == 0:
|
||||
status = props_data.get('pageProps').get('statusCode')
|
||||
if status == 0:
|
||||
return self._extract_aweme(props_data, webpage, url)
|
||||
elif status == 10216:
|
||||
raise ExtractorError('This video is private', expected=True)
|
||||
|
||||
raise ExtractorError('Video not available', video_id=video_id)
|
||||
|
||||
@@ -23,6 +23,8 @@ class VGTVIE(XstreamIE):
|
||||
'fvn.no/fvntv': 'fvntv',
|
||||
'aftenposten.no/webtv': 'aptv',
|
||||
'ap.vgtv.no/webtv': 'aptv',
|
||||
'tv.aftonbladet.se': 'abtv',
|
||||
# obsolete URL schemas, kept in order to save one HTTP redirect
|
||||
'tv.aftonbladet.se/abtv': 'abtv',
|
||||
'www.aftonbladet.se/tv': 'abtv',
|
||||
}
|
||||
@@ -140,6 +142,10 @@ class VGTVIE(XstreamIE):
|
||||
'url': 'http://www.vgtv.no/#!/video/127205/inside-the-mind-of-favela-funk',
|
||||
'only_matching': True,
|
||||
},
|
||||
{
|
||||
'url': 'https://tv.aftonbladet.se/video/36015/vulkanutbrott-i-rymden-nu-slapper-nasa-bilderna',
|
||||
'only_matching': True,
|
||||
},
|
||||
{
|
||||
'url': 'http://tv.aftonbladet.se/abtv/articles/36015',
|
||||
'only_matching': True,
|
||||
|
||||
@@ -1947,7 +1947,7 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
|
||||
f['format_id'] = itag
|
||||
formats.append(f)
|
||||
|
||||
if self._downloader.params.get('youtube_include_dash_manifest'):
|
||||
if self._downloader.params.get('youtube_include_dash_manifest', True):
|
||||
dash_manifest_url = streaming_data.get('dashManifestUrl')
|
||||
if dash_manifest_url:
|
||||
for f in self._extract_mpd_formats(
|
||||
@@ -2150,6 +2150,7 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
|
||||
# This will error if there is no livechat
|
||||
initial_data['contents']['twoColumnWatchNextResults']['conversationBar']['liveChatRenderer']['continuations'][0]['reloadContinuationData']['continuation']
|
||||
info['subtitles']['live_chat'] = [{
|
||||
'url': 'https://www.youtube.com/watch?v=%s' % video_id, # url is needed to set cookies
|
||||
'video_id': video_id,
|
||||
'ext': 'json',
|
||||
'protocol': 'youtube_live_chat_replay',
|
||||
|
||||
@@ -1,93 +1,94 @@
|
||||
# coding: utf-8
|
||||
from __future__ import unicode_literals
|
||||
|
||||
import re
|
||||
|
||||
from .common import InfoExtractor
|
||||
from ..utils import (
|
||||
ExtractorError,
|
||||
int_or_none,
|
||||
update_url_query,
|
||||
)
|
||||
|
||||
|
||||
class ZingMp3BaseInfoExtractor(InfoExtractor):
|
||||
class ZingMp3BaseIE(InfoExtractor):
|
||||
_VALID_URL_TMPL = r'https?://(?:mp3\.zing|zingmp3)\.vn/(?:%s)/[^/]+/(?P<id>\w+)\.html'
|
||||
_GEO_COUNTRIES = ['VN']
|
||||
|
||||
def _extract_item(self, item, page_type, fatal=True):
|
||||
error_message = item.get('msg')
|
||||
if error_message:
|
||||
if not fatal:
|
||||
return
|
||||
raise ExtractorError(
|
||||
'%s returned error: %s' % (self.IE_NAME, error_message),
|
||||
expected=True)
|
||||
def _extract_item(self, item, fatal):
|
||||
item_id = item['id']
|
||||
title = item.get('name') or item['title']
|
||||
|
||||
formats = []
|
||||
for quality, source_url in zip(item.get('qualities') or item.get('quality', []), item.get('source_list') or item.get('source', [])):
|
||||
if not source_url or source_url == 'require vip':
|
||||
for k, v in (item.get('source') or {}).items():
|
||||
if not v:
|
||||
continue
|
||||
if not re.match(r'https?://', source_url):
|
||||
source_url = '//' + source_url
|
||||
source_url = self._proto_relative_url(source_url, 'http:')
|
||||
quality_num = int_or_none(quality)
|
||||
f = {
|
||||
'format_id': quality,
|
||||
'url': source_url,
|
||||
}
|
||||
if page_type == 'video':
|
||||
f.update({
|
||||
'height': quality_num,
|
||||
'ext': 'mp4',
|
||||
})
|
||||
if k in ('mp4', 'hls'):
|
||||
for res, video_url in v.items():
|
||||
if not video_url:
|
||||
continue
|
||||
if k == 'hls':
|
||||
formats.extend(self._extract_m3u8_formats(
|
||||
video_url, item_id, 'mp4',
|
||||
'm3u8_native', m3u8_id=k, fatal=False))
|
||||
elif k == 'mp4':
|
||||
formats.append({
|
||||
'format_id': 'mp4-' + res,
|
||||
'url': video_url,
|
||||
'height': int_or_none(self._search_regex(
|
||||
r'^(\d+)p', res, 'resolution', default=None)),
|
||||
})
|
||||
else:
|
||||
f.update({
|
||||
'abr': quality_num,
|
||||
formats.append({
|
||||
'ext': 'mp3',
|
||||
'format_id': k,
|
||||
'tbr': int_or_none(k),
|
||||
'url': self._proto_relative_url(v),
|
||||
'vcodec': 'none',
|
||||
})
|
||||
formats.append(f)
|
||||
if not formats:
|
||||
if not fatal:
|
||||
return
|
||||
msg = item['msg']
|
||||
if msg == 'Sorry, this content is not available in your country.':
|
||||
self.raise_geo_restricted(countries=self._GEO_COUNTRIES)
|
||||
raise ExtractorError(msg, expected=True)
|
||||
self._sort_formats(formats)
|
||||
|
||||
cover = item.get('cover')
|
||||
subtitles = None
|
||||
lyric = item.get('lyric')
|
||||
if lyric:
|
||||
subtitles = {
|
||||
'origin': [{
|
||||
'url': lyric,
|
||||
}],
|
||||
}
|
||||
|
||||
album = item.get('album') or {}
|
||||
|
||||
return {
|
||||
'title': (item.get('name') or item.get('title')).strip(),
|
||||
'id': item_id,
|
||||
'title': title,
|
||||
'formats': formats,
|
||||
'thumbnail': 'http:/' + cover if cover else None,
|
||||
'artist': item.get('artist'),
|
||||
'thumbnail': item.get('thumbnail'),
|
||||
'subtitles': subtitles,
|
||||
'duration': int_or_none(item.get('duration')),
|
||||
'track': title,
|
||||
'artist': item.get('artists_names'),
|
||||
'album': album.get('name') or album.get('title'),
|
||||
'album_artist': album.get('artists_names'),
|
||||
}
|
||||
|
||||
def _extract_player_json(self, player_json_url, id, page_type, playlist_title=None):
|
||||
player_json = self._download_json(player_json_url, id, 'Downloading Player JSON')
|
||||
items = player_json['data']
|
||||
if 'item' in items:
|
||||
items = items['item']
|
||||
|
||||
if len(items) == 1:
|
||||
# one single song
|
||||
data = self._extract_item(items[0], page_type)
|
||||
data['id'] = id
|
||||
|
||||
return data
|
||||
else:
|
||||
# playlist of songs
|
||||
entries = []
|
||||
|
||||
for i, item in enumerate(items, 1):
|
||||
entry = self._extract_item(item, page_type, fatal=False)
|
||||
if not entry:
|
||||
continue
|
||||
entry['id'] = '%s-%d' % (id, i)
|
||||
entries.append(entry)
|
||||
|
||||
return {
|
||||
'_type': 'playlist',
|
||||
'id': id,
|
||||
'title': playlist_title,
|
||||
'entries': entries,
|
||||
}
|
||||
def _real_extract(self, url):
|
||||
page_id = self._match_id(url)
|
||||
webpage = self._download_webpage(
|
||||
url.replace('://zingmp3.vn/', '://mp3.zing.vn/'),
|
||||
page_id, query={'play_song': 1})
|
||||
data_path = self._search_regex(
|
||||
r'data-xml="([^"]+)', webpage, 'data path')
|
||||
return self._process_data(self._download_json(
|
||||
'https://mp3.zing.vn/xhr' + data_path, page_id)['data'])
|
||||
|
||||
|
||||
class ZingMp3IE(ZingMp3BaseInfoExtractor):
|
||||
_VALID_URL = r'https?://mp3\.zing\.vn/(?:bai-hat|album|playlist|video-clip)/[^/]+/(?P<id>\w+)\.html'
|
||||
class ZingMp3IE(ZingMp3BaseIE):
|
||||
_VALID_URL = ZingMp3BaseIE._VALID_URL_TMPL % 'bai-hat|video-clip'
|
||||
_TESTS = [{
|
||||
'url': 'http://mp3.zing.vn/bai-hat/Xa-Mai-Xa-Bao-Thy/ZWZB9WAB.html',
|
||||
'md5': 'ead7ae13693b3205cbc89536a077daed',
|
||||
@@ -95,49 +96,66 @@ class ZingMp3IE(ZingMp3BaseInfoExtractor):
|
||||
'id': 'ZWZB9WAB',
|
||||
'title': 'Xa Mãi Xa',
|
||||
'ext': 'mp3',
|
||||
'thumbnail': r're:^https?://.*\.jpg$',
|
||||
'thumbnail': r're:^https?://.+\.jpg',
|
||||
'subtitles': {
|
||||
'origin': [{
|
||||
'ext': 'lrc',
|
||||
}]
|
||||
},
|
||||
'duration': 255,
|
||||
'track': 'Xa Mãi Xa',
|
||||
'artist': 'Bảo Thy',
|
||||
'album': 'Special Album',
|
||||
'album_artist': 'Bảo Thy',
|
||||
},
|
||||
}, {
|
||||
'url': 'http://mp3.zing.vn/video-clip/Let-It-Go-Frozen-OST-Sungha-Jung/ZW6BAEA0.html',
|
||||
'md5': '870295a9cd8045c0e15663565902618d',
|
||||
'url': 'https://mp3.zing.vn/video-clip/Suong-Hoa-Dua-Loi-K-ICM-RYO/ZO8ZF7C7.html',
|
||||
'md5': 'e9c972b693aa88301ef981c8151c4343',
|
||||
'info_dict': {
|
||||
'id': 'ZW6BAEA0',
|
||||
'title': 'Let It Go (Frozen OST)',
|
||||
'id': 'ZO8ZF7C7',
|
||||
'title': 'Sương Hoa Đưa Lối',
|
||||
'ext': 'mp4',
|
||||
'thumbnail': r're:^https?://.+\.jpg',
|
||||
'duration': 207,
|
||||
'track': 'Sương Hoa Đưa Lối',
|
||||
'artist': 'K-ICM, RYO',
|
||||
},
|
||||
}, {
|
||||
'url': 'http://mp3.zing.vn/album/Lau-Dai-Tinh-Ai-Bang-Kieu-Minh-Tuyet/ZWZBWDAF.html',
|
||||
'info_dict': {
|
||||
'_type': 'playlist',
|
||||
'id': 'ZWZBWDAF',
|
||||
'title': 'Lâu Đài Tình Ái - Bằng Kiều,Minh Tuyết | Album 320 lossless',
|
||||
},
|
||||
'playlist_count': 10,
|
||||
'skip': 'removed at the request of the owner',
|
||||
}, {
|
||||
'url': 'http://mp3.zing.vn/playlist/Duong-Hong-Loan-apollobee/IWCAACCB.html',
|
||||
'url': 'https://zingmp3.vn/bai-hat/Xa-Mai-Xa-Bao-Thy/ZWZB9WAB.html',
|
||||
'only_matching': True,
|
||||
}]
|
||||
IE_NAME = 'zingmp3'
|
||||
IE_DESC = 'mp3.zing.vn'
|
||||
|
||||
def _real_extract(self, url):
|
||||
page_id = self._match_id(url)
|
||||
def _process_data(self, data):
|
||||
return self._extract_item(data, True)
|
||||
|
||||
webpage = self._download_webpage(url, page_id)
|
||||
|
||||
player_json_url = self._search_regex([
|
||||
r'data-xml="([^"]+)',
|
||||
r'&xmlURL=([^&]+)&'
|
||||
], webpage, 'player xml url')
|
||||
class ZingMp3AlbumIE(ZingMp3BaseIE):
|
||||
_VALID_URL = ZingMp3BaseIE._VALID_URL_TMPL % 'album|playlist'
|
||||
_TESTS = [{
|
||||
'url': 'http://mp3.zing.vn/album/Lau-Dai-Tinh-Ai-Bang-Kieu-Minh-Tuyet/ZWZBWDAF.html',
|
||||
'info_dict': {
|
||||
'_type': 'playlist',
|
||||
'id': 'ZWZBWDAF',
|
||||
'title': 'Lâu Đài Tình Ái',
|
||||
},
|
||||
'playlist_count': 10,
|
||||
}, {
|
||||
'url': 'http://mp3.zing.vn/playlist/Duong-Hong-Loan-apollobee/IWCAACCB.html',
|
||||
'only_matching': True,
|
||||
}, {
|
||||
'url': 'https://zingmp3.vn/album/Lau-Dai-Tinh-Ai-Bang-Kieu-Minh-Tuyet/ZWZBWDAF.html',
|
||||
'only_matching': True,
|
||||
}]
|
||||
IE_NAME = 'zingmp3:album'
|
||||
|
||||
playlist_title = None
|
||||
page_type = self._search_regex(r'/(?:html5)?xml/([^/-]+)', player_json_url, 'page type')
|
||||
if page_type == 'video':
|
||||
player_json_url = update_url_query(player_json_url, {'format': 'json'})
|
||||
else:
|
||||
player_json_url = player_json_url.replace('/xml/', '/html5xml/')
|
||||
if page_type == 'album':
|
||||
playlist_title = self._og_search_title(webpage)
|
||||
|
||||
return self._extract_player_json(player_json_url, page_id, page_type, playlist_title)
|
||||
def _process_data(self, data):
|
||||
def entries():
|
||||
for item in (data.get('items') or []):
|
||||
entry = self._extract_item(item, False)
|
||||
if entry:
|
||||
yield entry
|
||||
info = data.get('info') or {}
|
||||
return self.playlist_result(
|
||||
entries(), info.get('id'), info.get('name') or info.get('title'))
|
||||
|
||||
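The rewritten `ZingMp3BaseIE._extract_item` builds its format list from a `source` mapping: `mp4` and `hls` hold a nested mapping keyed by resolution, and every other key is treated as an audio bitrate. The sketch below reproduces that branching on an invented payload. `build_formats` is a hypothetical helper, `int_or_none` is a minimal stand-in for the yt-dlp utility of the same name, the HLS branch is skipped because it needs network access, and the `_proto_relative_url` call from the diff is left out.

```python
import re


def int_or_none(value):
    # Minimal stand-in for yt-dlp's int_or_none helper.
    try:
        return int(value)
    except (TypeError, ValueError):
        return None


def build_formats(source):
    formats = []
    for k, v in (source or {}).items():
        if not v:
            continue
        if k == 'mp4':
            for res, video_url in v.items():
                if not video_url:
                    continue
                m = re.match(r'^(\d+)p', res)
                formats.append({
                    'format_id': 'mp4-' + res,
                    'url': video_url,
                    'height': int_or_none(m.group(1)) if m else None,
                })
        elif k == 'hls':
            continue  # manifests need a download step; skipped in this sketch
        else:
            # Any other key is assumed to be an audio bitrate such as '128'.
            formats.append({
                'ext': 'mp3',
                'format_id': k,
                'tbr': int_or_none(k),
                'url': v,
                'vcodec': 'none',
            })
    return formats


sample = {  # invented payload
    '128': 'https://example.invalid/a128.mp3',
    '320': '',
    'mp4': {'480p': 'https://example.invalid/v480.mp4', '720p': None},
}
formats = build_formats(sample)
```

Empty values are skipped at both levels, so a payload that only lists unavailable qualities yields an empty list, which is the case where the extractor falls back to the `msg` field.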
@@ -1,82 +1,68 @@
|
||||
# coding: utf-8
|
||||
from __future__ import unicode_literals
|
||||
|
||||
import re
|
||||
|
||||
from .common import InfoExtractor
|
||||
from ..utils import (
|
||||
ExtractorError,
|
||||
int_or_none,
|
||||
url_or_none,
|
||||
js_to_json,
|
||||
parse_filesize,
|
||||
urlencode_postdata
|
||||
urlencode_postdata,
|
||||
)
|
||||
|
||||
|
||||
class ZoomIE(InfoExtractor):
|
||||
IE_NAME = 'zoom'
|
||||
_VALID_URL = r'https://(?:.*).?zoom.us/rec(?:ording)?/(play|share)/(?P<id>[A-Za-z0-9\-_.]+)'
|
||||
|
||||
_VALID_URL = r'(?P<base_url>https?://(?:[^.]+\.)?zoom.us/)rec(?:ording)?/(?:play|share)/(?P<id>[A-Za-z0-9_.-]+)'
|
||||
_TEST = {
|
||||
'url': 'https://zoom.us/recording/play/SILVuCL4bFtRwWTtOCFQQxAsBQsJljFtm9e4Z_bvo-A8B-nzUSYZRNuPl3qW5IGK',
|
||||
'url': 'https://economist.zoom.us/rec/play/dUk_CNBETmZ5VA2BwEl-jjakPpJ3M1pcfVYAPRsoIbEByGsLjUZtaa4yCATQuOL3der8BlTwxQePl_j0.EImBkXzTIaPvdZO5',
|
||||
'md5': 'ab445e8c911fddc4f9adc842c2c5d434',
|
||||
'info_dict': {
|
||||
'md5': '031a5b379f1547a8b29c5c4c837dccf2',
|
||||
'title': "GAZ Transformational Tuesdays W/ Landon & Stapes",
|
||||
'id': "SILVuCL4bFtRwWTtOCFQQxAsBQsJljFtm9e4Z_bvo-A8B-nzUSYZRNuPl3qW5IGK",
|
||||
'ext': "mp4"
|
||||
'id': 'dUk_CNBETmZ5VA2BwEl-jjakPpJ3M1pcfVYAPRsoIbEByGsLjUZtaa4yCATQuOL3der8BlTwxQePl_j0.EImBkXzTIaPvdZO5',
|
||||
'ext': 'mp4',
|
||||
'title': 'China\'s "two sessions" and the new five-year plan',
|
||||
}
|
||||
}
|
||||
|
||||
def _real_extract(self, url):
|
||||
display_id = self._match_id(url)
|
||||
webpage = self._download_webpage(url, display_id)
|
||||
base_url, play_id = re.match(self._VALID_URL, url).groups()
|
||||
webpage = self._download_webpage(url, play_id)
|
||||
|
||||
password_protected = self._search_regex(r'<form[^>]+?id="(password_form)"', webpage, 'password field', fatal=False, default=None)
|
||||
if password_protected is not None:
|
||||
self._verify_video_password(url, display_id, webpage)
|
||||
webpage = self._download_webpage(url, display_id)
|
||||
try:
|
||||
form = self._form_hidden_inputs('password_form', webpage)
|
||||
except ExtractorError:
|
||||
form = None
|
||||
if form:
|
||||
password = self._downloader.params.get('videopassword')
|
||||
if not password:
|
||||
raise ExtractorError(
|
||||
'This video is protected by a passcode, use the --video-password option', expected=True)
|
||||
is_meeting = form.get('useWhichPasswd') == 'meeting'
|
||||
validation = self._download_json(
|
||||
base_url + 'rec/validate%s_passwd' % ('_meet' if is_meeting else ''),
|
||||
play_id, 'Validating passcode', 'Wrong passcode', data=urlencode_postdata({
|
||||
'id': form[('meet' if is_meeting else 'file') + 'Id'],
|
||||
'passwd': password,
|
||||
'action': form.get('action'),
|
||||
}))
|
||||
if not validation.get('status'):
|
||||
raise ExtractorError(validation['errorMessage'], expected=True)
|
||||
webpage = self._download_webpage(url, play_id)
|
||||
|
||||
video_url = self._search_regex(r"viewMp4Url: \'(.*)\'", webpage, 'video url')
|
||||
title = self._html_search_regex([r"topic: \"(.*)\",", r"<title>(.*) - Zoom</title>"], webpage, 'title')
|
||||
viewResolvtionsWidth = self._search_regex(r"viewResolvtionsWidth: (\d*)", webpage, 'res width', fatal=False)
|
||||
viewResolvtionsHeight = self._search_regex(r"viewResolvtionsHeight: (\d*)", webpage, 'res height', fatal=False)
|
||||
fileSize = parse_filesize(self._search_regex(r"fileSize: \'(.+)\'", webpage, 'fileSize', fatal=False))
|
||||
|
||||
urlprefix = url.split("zoom.us")[0] + "zoom.us/"
|
||||
|
||||
formats = []
|
||||
formats.append({
|
||||
'url': url_or_none(video_url),
|
||||
'width': int_or_none(viewResolvtionsWidth),
|
||||
'height': int_or_none(viewResolvtionsHeight),
|
||||
'http_headers': {'Accept': 'video/webm,video/ogg,video/*;q=0.9,application/ogg;q=0.7,audio/*;q=0.6,*/*;q=0.5',
|
||||
'Referer': urlprefix},
|
||||
'ext': "mp4",
|
||||
'filesize_approx': int_or_none(fileSize)
|
||||
})
|
||||
self._sort_formats(formats)
|
||||
data = self._parse_json(self._search_regex(
|
||||
r'(?s)window\.__data__\s*=\s*({.+?});',
|
||||
webpage, 'data'), play_id, js_to_json)
|
||||
|
||||
return {
|
||||
'id': display_id,
|
||||
'title': title,
|
||||
'formats': formats
|
||||
'id': play_id,
|
||||
'title': data['topic'],
|
||||
'url': data['viewMp4Url'],
|
||||
'width': int_or_none(data.get('viewResolvtionsWidth')),
|
||||
'height': int_or_none(data.get('viewResolvtionsHeight')),
|
||||
'http_headers': {
|
||||
'Referer': base_url,
|
||||
},
|
||||
'filesize_approx': parse_filesize(data.get('fileSize')),
|
||||
}
|
||||
|
||||
def _verify_video_password(self, url, video_id, webpage):
|
||||
password = self._downloader.params.get('videopassword')
|
||||
if password is None:
|
||||
raise ExtractorError('This video is protected by a password, use the --video-password option', expected=True)
|
||||
meetId = self._search_regex(r'<input[^>]+?id="meetId" value="([^\"]+)"', webpage, 'meetId')
|
||||
data = urlencode_postdata({
|
||||
'id': meetId,
|
||||
'passwd': password,
|
||||
'action': "viewdetailedpage",
|
||||
'recaptcha': ""
|
||||
})
|
||||
validation_url = url.split("zoom.us")[0] + "zoom.us/rec/validate_meet_passwd"
|
||||
validation_response = self._download_json(
|
||||
validation_url, video_id,
|
||||
note='Validating Password...',
|
||||
errnote='Wrong password?',
|
||||
data=data)
|
||||
|
||||
if validation_response['errorCode'] != 0:
|
||||
raise ExtractorError('Login failed, %s said: %r' % (self.IE_NAME, validation_response['errorMessage']))
|
||||
|
||||
@@ -1147,13 +1147,18 @@ def parseOpts(overrideArguments=None):
|
||||
metavar='FIELD:FORMAT', dest='metafromfield', action='append',
|
||||
help=(
|
||||
'Parse additional metadata like title/artist from other fields. '
|
||||
'Give field name to extract data from, and format of the field seperated by a ":". '
|
||||
'Give a template or field name to extract data from and the '
|
||||
'format to interpret it as, separated by a ":". '
|
||||
'Either regular expression with named capture groups or a '
|
||||
'similar syntax to the output template can also be used. '
|
||||
'The parsed parameters replace any existing values and can be use in output template. '
|
||||
'similar syntax to the output template can be used for the FORMAT. '
|
||||
'Similarly, the syntax for output template can be used for FIELD '
|
||||
'to parse the data from multiple fields. '
|
||||
'The parsed parameters replace any existing values and can be used in output templates. '
|
||||
'This option can be used multiple times. '
|
||||
'Example: --parse-metadata "title:%(artist)s - %(title)s" matches a title like '
|
||||
'"Coldplay - Paradise". '
|
||||
'Example: --parse-metadata "%(series)s %(episode_number)s:%(title)s" '
|
||||
'sets the title using series and episode number. '
|
||||
'Example (regex): --parse-metadata "description:Artist - (?P<artist>.+?)"'))
|
||||
postproc.add_option(
|
||||
'--xattrs',
|
||||
|
||||
@@ -4,11 +4,10 @@ import re
|
||||
|
||||
from .common import PostProcessor
|
||||
from ..compat import compat_str
|
||||
from ..utils import str_or_none
|
||||
|
||||
|
||||
class MetadataFromFieldPP(PostProcessor):
|
||||
regex = r'(?P<field>\w+):(?P<format>.+)$'
|
||||
regex = r'(?P<in>.+):(?P<out>.+)$'
|
||||
|
||||
def __init__(self, downloader, formats):
|
||||
PostProcessor.__init__(self, downloader)
|
||||
@@ -19,11 +18,20 @@ class MetadataFromFieldPP(PostProcessor):
|
||||
match = re.match(self.regex, f)
|
||||
assert match is not None
|
||||
self._data.append({
|
||||
'field': match.group('field'),
|
||||
'format': match.group('format'),
|
||||
'regex': self.format_to_regex(match.group('format'))})
|
||||
'in': match.group('in'),
|
||||
'out': match.group('out'),
|
||||
'tmpl': self.field_to_template(match.group('in')),
|
||||
'regex': self.format_to_regex(match.group('out')),
|
||||
})
|
||||
|
||||
def format_to_regex(self, fmt):
|
||||
@staticmethod
|
||||
def field_to_template(tmpl):
|
||||
if re.match(r'\w+$', tmpl):
|
||||
return '%%(%s)s' % tmpl
|
||||
return tmpl
|
||||
|
||||
@staticmethod
|
||||
def format_to_regex(fmt):
|
||||
r"""
|
||||
Converts a string like
|
||||
'%(title)s - %(artist)s'
|
||||
@@ -37,7 +45,7 @@ class MetadataFromFieldPP(PostProcessor):
|
||||
# replace %(..)s with regex group and escape other string parts
|
||||
for match in re.finditer(r'%\((\w+)\)s', fmt):
|
||||
regex += re.escape(fmt[lastpos:match.start()])
|
||||
regex += r'(?P<' + match.group(1) + r'>[^\r\n]+)'
|
||||
regex += r'(?P<%s>[^\r\n]+)' % match.group(1)
|
||||
lastpos = match.end()
|
||||
if lastpos < len(fmt):
|
||||
regex += re.escape(fmt[lastpos:])
|
||||
@@ -45,22 +53,16 @@ class MetadataFromFieldPP(PostProcessor):
|
||||
|
||||
def run(self, info):
|
||||
for dictn in self._data:
|
||||
field, regex = dictn['field'], dictn['regex']
|
||||
if field not in info:
|
||||
self.report_warning('Video doesnot have a %s' % field)
|
||||
continue
|
||||
data_to_parse = str_or_none(info[field])
|
||||
if data_to_parse is None:
|
||||
self.report_warning('Field %s cannot be parsed' % field)
|
||||
continue
|
||||
self.write_debug('Searching for r"%s" in %s' % (regex, field))
|
||||
match = re.search(regex, data_to_parse)
|
||||
tmpl, info_copy = self._downloader.prepare_outtmpl(dictn['tmpl'], info)
|
||||
data_to_parse = tmpl % info_copy
|
||||
self.write_debug('Searching for r"%s" in %s' % (dictn['regex'], tmpl))
|
||||
match = re.search(dictn['regex'], data_to_parse)
|
||||
if match is None:
|
||||
self.report_warning('Could not interpret video %s as "%s"' % (field, dictn['format']))
|
||||
self.report_warning('Could not interpret video %s as "%s"' % (dictn['in'], dictn['out']))
|
||||
continue
|
||||
for attribute, value in match.groupdict().items():
|
||||
info[attribute] = value
|
||||
self.to_screen('parsed %s from %s: %s' % (attribute, field, value if value is not None else 'NA'))
|
||||
self.to_screen('parsed %s from "%s": %s' % (attribute, dictn['in'], value if value is not None else 'NA'))
|
||||
return [], info
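The reworked `MetadataFromFieldPP` can be tried outside yt-dlp with the standalone sketch below. `field_to_template` and the loop in `format_to_regex` follow the diff. Two parts are assumptions: the early return that passes a plain regular expression through unchanged is inferred from the help text, and plain `%` formatting stands in for `prepare_outtmpl`. `parse_metadata` is a hypothetical helper, not part of yt-dlp.

```python
import re


def field_to_template(tmpl):
    # A bare field name such as "title" becomes the template "%(title)s";
    # anything else is assumed to be a template already.
    if re.match(r'\w+$', tmpl):
        return '%%(%s)s' % tmpl
    return tmpl


def format_to_regex(fmt):
    # Assumption: a FORMAT without %(name)s placeholders is treated as a
    # regular expression and returned unchanged.
    if not re.search(r'%\(\w+\)s', fmt):
        return fmt
    regex = ''
    lastpos = 0
    # Replace each %(name)s with a named group and escape the literal text.
    for match in re.finditer(r'%\((\w+)\)s', fmt):
        regex += re.escape(fmt[lastpos:match.start()])
        regex += r'(?P<%s>[^\r\n]+)' % match.group(1)
        lastpos = match.end()
    if lastpos < len(fmt):
        regex += re.escape(fmt[lastpos:])
    return regex


def parse_metadata(spec, info):
    """Hypothetical helper: apply one FIELD:FORMAT spec to an info dict."""
    match = re.match(r'(?P<in>.+):(?P<out>.+)$', spec)
    if match is None:
        raise ValueError('invalid spec: %r' % spec)
    # Simplification: plain % formatting instead of prepare_outtmpl.
    data_to_parse = field_to_template(match.group('in')) % info
    found = re.search(format_to_regex(match.group('out')), data_to_parse)
    return found.groupdict() if found else {}


from_title = parse_metadata(
    'title:%(artist)s - %(title)s', {'title': 'Coldplay - Paradise'})
from_fields = parse_metadata(
    '%(series)s %(episode_number)s:%(title)s',
    {'series': 'Show', 'episode_number': 3})
```

The first call mirrors the "Coldplay - Paradise" example from the help text; the second shows FIELD used as a template that combines several fields before matching.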
|
||||
|
||||
|
||||
|
||||
@@ -2423,6 +2423,15 @@ class DownloadError(YoutubeDLError):
|
||||
self.exc_info = exc_info
|
||||
|
||||
|
||||
class EntryNotInPlaylist(YoutubeDLError):
|
||||
"""Entry not in playlist exception.
|
||||
|
||||
This exception will be thrown by YoutubeDL when a requested entry
|
||||
is not found in the playlist info_dict
|
||||
"""
|
||||
pass
|
||||
|
||||
|
||||
class SameFileError(YoutubeDLError):
|
||||
"""Same File exception.
|
||||
|
||||
@@ -4196,6 +4205,20 @@ OUTTMPL_TYPES = {
|
||||
'pl_infojson': 'info.json',
|
||||
}
|
||||
|
||||
# As of [1] format syntax is:
|
||||
# %[mapping_key][conversion_flags][minimum_width][.precision][length_modifier]type
|
||||
# 1. https://docs.python.org/2/library/stdtypes.html#string-formatting
|
||||
FORMAT_RE = r'''(?x)
|
||||
(?<!%)
|
||||
%
|
||||
\({0}\) # mapping key
|
||||
(?:[#0\-+ ]+)? # conversion flags (optional)
|
||||
(?:\d+)? # minimum field width (optional)
|
||||
(?:\.\d+)? # precision (optional)
|
||||
[hlL]? # length modifier (optional)
|
||||
(?P<type>[diouxXeEfFgGcrs%]) # conversion type
|
||||
'''
|
||||
|
||||
|
||||
def limit_length(s, length):
|
||||
""" Add ellipses to overly long strings """
|
||||
|
||||
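The `FORMAT_RE` pattern added in this hunk leaves `{0}` as a slot for the mapping key, so it must be filled with `str.format` before compiling. A short sketch of one way to use it is shown below; `conversion_type` is a hypothetical helper, and the diff itself does not show how yt-dlp calls the pattern.

```python
import re

# Pattern copied from the hunk above; {0} is the placeholder for the key.
FORMAT_RE = r'''(?x)
    (?<!%)
    %
    \({0}\)  # mapping key
    (?:[#0\-+ ]+)?  # conversion flags (optional)
    (?:\d+)?  # minimum field width (optional)
    (?:\.\d+)?  # precision (optional)
    [hlL]?  # length modifier (optional)
    (?P<type>[diouxXeEfFgGcrs%])  # conversion type
'''


def conversion_type(template, key):
    """Return the conversion type used for `key` in `template`, or None."""
    match = re.search(FORMAT_RE.format(re.escape(key)), template)
    return match.group('type') if match else None


string_type = conversion_type('%(title)s - %(id)s', 'title')
padded_type = conversion_type('%(view_count)05d', 'view_count')
escaped = conversion_type('%%(title)s', 'title')
```

The `(?<!%)` lookbehind is what keeps an escaped `%%(title)s` from being treated as a placeholder, so `escaped` is `None` here.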
@@ -1,3 +1,3 @@
|
||||
from __future__ import unicode_literals
|
||||
|
||||
__version__ = '2021.03.15'
|
||||
__version__ = '2021.03.21'
|
||||
|
||||