124 Commits

Author SHA1 Message Date
AaronBenDaniel 3f5cd7985d fix(api): Carry over paragraph styling (#51) 2025-04-12 23:22:14 +05:30
AaronBenDaniel 52faaf54c2 fix(api): Stopped extra newlines from appearing with bold/italics (#40) 2025-04-04 07:46:05 +05:30
AaronBenDaniel e79453ab5f fix(api): Strip control characters from chapter titles (#47) 2025-02-15 16:34:28 +05:30
AaronBenDaniel d30f15a254 fix(api): Remove unused class (#43 - @AaronBenDaniel)
* fix(api): Typo fix

* fix(api): Remove chapitre/chapter class
2025-01-24 17:20:21 +05:30
AaronBenDaniel 08ff95d686 fix(api): Fix PDF ToC page number formatting 2025-01-01 06:07:36 +05:30
TheOnlyWayUp 222e07085f fix(api): PDF ToC is h1
Fixes #36
2024-12-26 15:05:57 +00:00
TheOnlyWayUp 5cfdc8f305 fix(api): PDF renders About the Author page
Fixes #35
2024-12-26 14:45:33 +00:00
TheOnlyWayUp 0ce8ab1943 fix(dockerfile): Remove unneeded dependencies 2024-12-22 13:37:19 +00:00
TheOnlyWayUp ef6430a5cf fix(dockerfile): Pull apt-fast from existing image 2024-12-22 12:16:43 +00:00
TheOnlyWayUp d6c31da507 fix(dockerfile): Install add-apt-repository 2024-12-22 11:37:30 +00:00
TheOnlyWayUp 84c08b1c38 Add launch.json for easy debugging 2024-12-22 11:23:33 +00:00
TheOnlyWayUp 352faceb7a fix(api): Use lxml parser for PDF Template 2024-12-22 11:23:19 +00:00
TheOnlyWayUp d2063f59be feat(dockerfile): Use apt-fast 2024-12-22 11:18:36 +00:00
TheOnlyWayUp 8df2b6e84e fix(dockerfile): Remove wkhtmltopdf 2024-12-22 11:11:21 +00:00
TheOnlyWayUp 3816ab3fd0 frontend: Add faster downloads to Changelog 2024-12-22 11:00:54 +00:00
TheOnlyWayUp 5215689836 Merge branch 'feature/#31-zip-downloading' into feature/#29-pdf-downloads 2024-12-22 10:58:28 +00:00
Aaron BenDaniel a1191b2600 feat(api): Make archive extraction asynchronous 2024-12-19 15:10:17 +00:00
AaronBenDaniel 82270dc770 feat(api): Download parts as .zip 2024-12-15 14:38:28 -05:00
TheOnlyWayUp 8dc7d16578 feat(api): Generate PDFs with Weasyprint! 2024-12-10 18:37:22 +00:00
TheOnlyWayUp f8ab318210 feat(api): Reconstruct tree from Content HTML, move PDF Template reads to Init 2024-12-10 18:36:23 +00:00
TheOnlyWayUp 758b14fd15 feat(api): Merge PDF Templates 2024-12-10 18:34:27 +00:00
TheOnlyWayUp 90139b190b fix(api): Remove wkhtmltopdf, add weasyprint to requirements 2024-12-10 15:56:29 +00:00
TheOnlyWayUp b4c3cfeffb Include samples in Readme 2024-12-10 15:42:27 +00:00
TheOnlyWayUp ff10b3c6c9 Add samples 2024-12-10 15:25:10 +00:00
TheOnlyWayUp a7a26dc2b6 feat(api): Improve link styling in PDFs 2024-12-10 15:19:28 +00:00
TheOnlyWayUp 016ad6209a fix(api): Use BytesIO when dumping generated book 2024-12-10 11:12:43 +00:00
TheOnlyWayUp 16c5a9216f fix(frontend): Downloading .htm files when error 2024-12-08 14:47:44 +00:00
TheOnlyWayUp 18799e5a91 fix(frontend): Update meta-tags to indicate PDF Downloads 2024-12-08 13:19:44 +00:00
TheOnlyWayUp c737f5314e fix(frontend): Open donate link in new tab 2024-12-08 13:17:02 +00:00
TheOnlyWayUp 097c37f24e fix(api): Error handlers accepting Request param 2024-12-08 13:14:42 +00:00
TheOnlyWayUp d924e1b6ce fix(frontend): Download PDFs instead of previewing them 2024-12-08 13:11:47 +00:00
TheOnlyWayUp 5d9eefd03c fix(api): EPUB Chapters had integer IDs, now bytes 2024-12-08 13:07:33 +00:00
TheOnlyWayUp e74642f6cb feat(dockerfile): Faster dependency installation with uv 2024-12-08 12:45:42 +00:00
TheOnlyWayUp ddb1862918 fix(api): Update requirements.txt 2024-12-08 12:42:08 +00:00
TheOnlyWayUp e4e372c664 fix(api): Fix dockerfile 2024-12-08 12:32:18 +00:00
TheOnlyWayUp 9747839ae9 fix(api): Add logging for PDF Generation 2024-12-08 12:04:53 +00:00
TheOnlyWayUp 7ef988ba42 fix(api): Add comments and docstrings 2024-12-08 11:51:16 +00:00
TheOnlyWayUp c51f32654c fix(dockerfile): Remove sudo 2024-12-08 11:25:35 +00:00
TheOnlyWayUp df58b59da2 fix(frontend): /donate resolves 2024-12-08 11:20:22 +00:00
TheOnlyWayUp 692f5ad82a api: Add donation URL to PDF 2024-12-08 11:13:04 +00:00
TheOnlyWayUp ca0b59cb0c fix(api): Move copyright before TOC, Merge copyright and cover pages, Move HTML to seperate files 2024-12-08 11:08:33 +00:00
TheOnlyWayUp 1948ed67ee feat(api): Include Copyright notice in PDFs! 2024-12-08 04:46:57 +00:00
TheOnlyWayUp e30bbf40b9 feat(api): PDF includes About the Author page! 2024-12-07 15:33:44 +00:00
TheOnlyWayUp 3f7591b15c feat(frontend): Prevent overflow on small screens 2024-12-07 13:41:29 +00:00
TheOnlyWayUp 174faafa0e api: Redirect to donate link 2024-12-07 10:31:00 +00:00
TheOnlyWayUp af026f1263 frontend: Add donate link 2024-12-07 10:29:47 +00:00
TheOnlyWayUp 4c43d01f64 feat(frontend): Support PDF Downloads! Update Changelog 2024-12-07 10:29:41 +00:00
TheOnlyWayUp fb42905b33 frontend: Add donate link 2024-12-07 10:28:40 +00:00
TheOnlyWayUp dd38369832 fix(api): Clean code 2024-12-07 10:00:49 +00:00
TheOnlyWayUp c116300272 fix(api): Symlink fonts to tmp as relative path are resolved from HTML tempfile path 2024-12-07 09:38:33 +00:00
TheOnlyWayUp 14dc13029a feat(api): Add Stylesheets for PDF 2024-12-07 09:37:37 +00:00
TheOnlyWayUp 5cbef75d19 fix(api): Close and delete temp files 2024-12-07 09:32:37 +00:00
TheOnlyWayUp 96d367da27 fix(api): Remove unnecessary newlines from Text 2024-12-07 09:31:00 +00:00
TheOnlyWayUp 7b521e492a feat(api): PDF Footers!
Image only chapters aren't packaged, that's a bug. Fixed a bug with the cover image not being included when download_images was False.
2024-12-07 06:36:34 +00:00
TheOnlyWayUp f0e7d79d2f gitignore: Ignore .html files 2024-12-07 06:02:39 +00:00
TheOnlyWayUp c6174aa418 api: Clean code 2024-12-07 06:02:27 +00:00
TheOnlyWayUp c33c773fe7 feat(api): Include high-res cover with EPUB and PDF Downloads 2024-12-07 05:59:17 +00:00
TheOnlyWayUp 8728b215ee feat(api): PDF Image downloads are functional! 2024-12-07 04:17:01 +00:00
TheOnlyWayUp 40bad57eac feat(api): PDF Downloads functional!
Image downloads borked
2024-12-06 15:32:26 +00:00
TheOnlyWayUp 6c6c8f81b6 feat(api): Add exiftool config for Completed and MatureContent metadata properties 2024-12-06 15:29:17 +00:00
TheOnlyWayUp 6bb63dd67b feat(api): Add exiftool to requirements 2024-12-06 15:28:51 +00:00
TheOnlyWayUp f9631a8f31 feat(api): Add exiftool to dockerfile
Untested, I probably have to install Perl as well
2024-12-06 15:28:32 +00:00
TheOnlyWayUp 12b022e780 fix(api): Download formats including Enum class name 2024-12-06 11:29:53 +00:00
TheOnlyWayUp 52c55227b2 feat(api): PDF Downloads functional! 2024-12-06 11:08:43 +00:00
TheOnlyWayUp ea9e415e52 fix(readme): Include instructions on fully-featured wkhtmltopdf installation 2024-12-06 11:08:19 +00:00
TheOnlyWayUp 36ccbb70eb fix(api): Dockerfile removes wkhtmltopdf deb file after installation 2024-12-06 11:07:57 +00:00
TheOnlyWayUp a025baded4 feat(api): PDF Downloads for single chapters functional 2024-12-06 10:56:30 +00:00
TheOnlyWayUp b05fe47914 feat(api): Errors are raised faster, add Exception classes 2024-12-06 07:56:08 +00:00
TheOnlyWayUp 0835992b23 feat(api): Add DownloadFormat type, restructure utils 2024-12-06 07:27:56 +00:00
TheOnlyWayUp 0f6cdd91a9 feat(api): Add pdfkit to requirements, wkhtml2pdf to Dockerfile 2024-12-06 07:20:21 +00:00
AaronBenDaniel f8900be6b3 fix: Add git to Dockerfile 2024-12-03 05:50:01 +05:30
TheOnlyWayUp a458b9c2f1 api: Update requirements.txt 2024-12-02 11:37:08 +00:00
TheOnlyWayUp 18d4df0674 api: Use keydb fork of aiohttp-client-cache
Natively expire hash key submembers
2024-12-02 11:25:32 +00:00
AaronBenDaniel c1db7babdd fix(frontend): Strip tracking info from URLs 2024-12-01 09:42:25 +05:30
TheOnlyWayUp f40d1e4b27 fix: README 2024-12-01 00:15:03 +00:00
TheOnlyWayUp 39837f6305 docs: Add Redis guide to README 2024-12-01 00:13:22 +00:00
TheOnlyWayUp 974c0bd341 fix(frontend): Update changelog 2024-12-01 00:04:52 +00:00
TheOnlyWayUp 5687c5f2cd fix(api): TTL for Redis Cache 2024-11-30 23:44:07 +00:00
Dhanush R 5f0676a19d Merge pull request #23 from TheOnlyWayUp/fix/#22-redis-cache
Concurrent requests fail

Co-authored-by: AaronBenDaniel <144371000+AaronBenDaniel@users.noreply.github.com>
2024-12-01 03:48:07 +05:30
AaronBenDaniel ec700ce284 fix(frontend): Remove unused function 2024-11-30 17:16:43 -05:00
AaronBenDaniel eafef1f1ec fix(frontend): Remove debug console.log() 2024-11-30 17:02:17 -05:00
TheOnlyWayUp 8e8773a61a fix(api): Lower logging status for debug message 2024-11-30 21:58:51 +00:00
TheOnlyWayUp 2b1d00b08e fix(frontend): Allow IDs to be typed 2024-11-30 21:53:16 +00:00
TheOnlyWayUp c29c26b33b Update requirements.txt 2024-11-30 21:38:12 +00:00
TheOnlyWayUp f91a01e574 feat(api): Add type validation for API Responses 2024-11-30 21:37:47 +00:00
TheOnlyWayUp a31c26f8c5 fix(api): Improve readability 2024-11-30 21:25:07 +00:00
TheOnlyWayUp 8b00d0b109 fix(api): Add logfiles to gitignore, remove debug code 2024-11-30 21:14:21 +00:00
TheOnlyWayUp 26b9db8945 fix(api): Remove unnecessary API Request, remove test script 2024-11-30 21:10:17 +00:00
TheOnlyWayUp a755ddb0e4 fix(api): Use CachedSession across codebase 2024-11-30 20:57:20 +00:00
TheOnlyWayUp 28e40ece94 feat(api): Add eliot logging, fix no cookies in authed requests 2024-11-30 20:54:59 +00:00
TheOnlyWayUp 6e222c1f55 feat(api): Cancel requests when client disconnects 2024-11-30 19:24:33 +00:00
TheOnlyWayUp 36c73d01e9 fix(api): Pydantic-settings for model-based env loading 2024-11-30 19:23:46 +00:00
TheOnlyWayUp 48fed5f0ce fix(api): Clean cached session usage 2024-11-30 16:54:14 +00:00
TheOnlyWayUp e3028867db fix(api): Default values for cache model 2024-11-30 16:53:22 +00:00
TheOnlyWayUp b1aa836254 feat(api): Add env config 2024-11-30 16:02:01 +00:00
TheOnlyWayUp 5ecbe028c3 feat(api): Conform to PEP 621
Start using Ruff/uv
2024-11-30 16:00:34 +00:00
Dhanush R 96877d9c9b feat(api): Descriptive error messages (#21 - @AaronBenDaniel)
Co-authored-by: AaronBenDaniel <144371000+AaronBenDaniel@users.noreply.github.com>
2024-11-28 18:40:13 +00:00
TheOnlyWayUp f9e27689e3 feat(api): Use FastAPI Error handler 2024-11-28 18:23:52 +00:00
AaronBenDaniel 308afde25f fix(api): Handle invalid part IDs 2024-11-24 21:42:52 -05:00
AaronBenDaniel fa1bac3045 feat(api): Add rate-limiting error message 2024-11-09 14:39:21 -05:00
AaronBenDaniel d58a119c10 feat(api): Invalid ID error message 2024-11-08 17:43:11 -05:00
Dhanush R 31b8d0c08c Update demo image on README 2024-11-09 03:27:09 +05:30
Dhanush R 40ae0fbb99 Update README.md 2024-11-09 00:15:53 +05:30
AaronBenDaniel af0981a679 fix(frontend): Help Modal updated for URLs (#18 - @AaronBenDaniel)
* fixed help modal

* fix(frontend): Update Help Modal

---------

Co-authored-by: TheOnlyWayUp <hi@towu.dev>
2024-11-08 23:12:38 +05:30
Dhanush R fc4866463f fix(frontend): Update donate link 2024-11-08 23:10:19 +05:30
AaronBenDaniel ca4697057c feat: Paste Links, Deprecate IDs (#17 - @AaronBenDaniel)
* deprecate Story IDs, require full URLs

* added FRONT-END ONLY support for part and list URLs

* add backend support for part IDs

* added backend support for lists

* Support enums

* Simplify and remove List support

* Update frontend

* Frontend: Revert dialog changes

* Remove List support

---------

Co-authored-by: TheOnlyWayUp <hi@towu.dev>
2024-11-07 08:39:34 +00:00
AaronBenDaniel e89dc7e699 Update featured image (#13 - @AaronBenDaniel)
* update featured image

* changed page format
2024-11-03 05:02:45 +05:30
Dhanush R d9c858b3b3 fix(api) - #11 Send to Kindle Support
* fix(api/image_downloads): Replace image url with file path

* fix(api/image_downloads): Add comments

* fix(frontend): Update changelog

* Support Send2Kindle

* Update changelog
2024-11-03 04:52:30 +05:30
Dhanush R c0695a9d17 fix(api/images): #14 - Image downloads functional
* fix(api/image_downloads): Replace image url with file path

* fix(api/image_downloads): Add comments

* fix(frontend): Update changelog
2024-11-02 03:07:41 +05:30
TheOnlyWayUp 75d42ba5ec fix: Style Discord Bot link 2024-10-06 08:33:11 +00:00
TheOnlyWayUp 33d6d912a2 feat: Add Discord Bot link 2024-10-06 06:10:45 +00:00
TheOnlyWayUp 9d7464b461 fix(frontend): Remove feedbackfish script 2024-09-17 18:23:26 +00:00
AaronBenDaniel 232795b050 fix(frontend): Download more button (#12 - @AaronBenDaniel)
* Fixed "Download More" button

* Revert "Fixed "Download More" button"

This reverts commit 620ad6afff.

* Reworked page reset

* fix(frontend): Download more button

---------

Co-authored-by: TheOnlyWayUp <hi@towu.dev>
2024-08-31 13:56:54 +05:30
TheOnlyWayUp 85bc4609c2 fix(frontend): Remove Query Params from ID-from-URL extraction 2024-07-11 15:28:45 +00:00
TheOnlyWayUp 3369325d03 fix(frontend): Populate download URL, accidentally removed 2024-07-10 14:06:06 +00:00
TheOnlyWayUp e16496ca94 fix(frontend): Reference @AaronBenDaniel's code for Story ID fetching from Part IDs 2024-07-09 15:29:19 +00:00
TheOnlyWayUp 3f9641d76a feat(frontend): Split pasted URLs to derive Story ID. Warn if Part ID 2024-07-09 15:08:46 +00:00
TheOnlyWayUp 868e02992b Update README 2024-07-08 12:59:49 +00:00
AaronBenDaniel 0184c786ce fix(frontend): URL Encode Username and Password (#9 - @AaronBenDaniel)
* add URI encoding to credentials

* chore(api): Comment on FastAPI's automatic URL Decode

---------

Co-authored-by: TheOnlyWayUp <hi@towu.dev>
2024-07-08 18:23:43 +05:30
TheOnlyWayUp b663448103 fix(frontend): Add todo for changelog on smaller screen sizes 2024-07-08 12:32:02 +00:00
AaronBenDaniel 0983c13da7 fix(api): Use HTML formatting consistently (#7 - @AaronBenDaniel) 2024-07-06 23:42:53 +05:30
TheOnlyWayUp 55763c1b99 fix(frontend): Update changelog 2024-06-30 20:13:52 +00:00
TheOnlyWayUp 9f24d437cb fix(frontend): Arabic language support 2024-06-30 20:09:07 +00:00
TheOnlyWayUp 79c9447cbe fix(frontend): Update changelog 2024-06-30 19:54:35 +00:00
29 changed files with 3680 additions and 321 deletions
+6
@@ -1,6 +1,12 @@
__pycache__
venv
*epub
*pdf
*html
data
*ipynb
build
.vscode
.venv
.env
*log
+24
@@ -0,0 +1,24 @@
{
// Use IntelliSense to learn about possible attributes.
// Hover to view descriptions of existing attributes.
// For more information, visit: https://go.microsoft.com/fwlink/?linkid=830387
"version": "0.2.0",
"configurations": [
{
"name": "Python Debugger: FastAPI",
"type": "debugpy",
"request": "launch",
"module": "uvicorn",
"args": [
"main:app",
"--reload",
"--host",
"0.0.0.0",
"--port",
"8086"
],
"jinja": true,
"cwd": "${workspaceFolder}/src/api/src"
}
]
}
+34 -6
@@ -12,14 +12,42 @@ RUN npm run build
FROM python:3.10-slim
WORKDIR /app
# Install apt-fast, git, exiftool
COPY --from=nobodyxu/apt-fast:latest-debian-buster-slim /usr/local/ /usr/local/
RUN apt update
RUN apt install -y aria2
RUN apt-fast install -y git build-essential libpango-1.0-0 libpangoft2-1.0-0 wget
ENV EXIFTOOL_VERSION="13.06"
RUN wget "https://exiftool.org/Image-ExifTool-${EXIFTOOL_VERSION}.tar.gz"
RUN gzip -dc "Image-ExifTool-${EXIFTOOL_VERSION}.tar.gz" | tar -xf -
WORKDIR /app/Image-ExifTool-${EXIFTOOL_VERSION}
RUN perl Makefile.PL
RUN make test
RUN make install
RUN rm -rf /var/lib/apt/lists/* /app/Image-ExifTool-${EXIFTOOL_VERSION}
WORKDIR /app
# --- #
COPY --from=ghcr.io/astral-sh/uv:latest /uv /uvx /bin/
COPY src/api/requirements.txt requirements.txt
RUN pip3 install -r requirements.txt
COPY --from=0 /build/build /app/build
# COPY src/api/src/.env .env
COPY src/api/src .
COPY src/api/exiftool.config exiftool.config
RUN uv pip install -r requirements.txt --system
COPY --from=0 /build/build /app/src/build
COPY src/api/src src
# Is this still needed?
RUN ln -s /app/src/pdf/fonts /tmp/fonts
WORKDIR /app/src
EXPOSE 80
# ENV PORT=80
CMD [ "python3", "main.py"]
+28 -6
@@ -2,20 +2,24 @@ WattpadDownloader ([Demo](https://wpd.rambhat.la))
---
Straightforward, Extendable WebApp to download Wattpad Books as EPUB Files.
![image](https://github.com/TheOnlyWayUp/WattpadDownloader/assets/76237496/8a3fda0b-b851-4c5f-9306-ba9c17cdcc8b)
![image](https://github.com/user-attachments/assets/b9d87d6b-5302-4561-98b0-d7f95bff9f04)
Stars ⭐ are appreciated. Thanks!
## Features
- ⚡ Lightweight Frontend and Minimal Javascript.
- ⚡ Lightweight Frontend.
- 🪙 Supports Authentication (Download paid stories from your account!)
- 🌐 API Support (Visit the `/docs` path on your instance for more.)
- 🐇 Fast Generation, Basic Ratelimit Handling.
- 🐇 Fast Generation
- 🗃️ Caching, Ratelimit handling
- 🐳 Docker Support
- 🏷️ Generated EPUB File includes Metadata. (Dublin Core Spec)
- 📖 Plays well with E-Readers. (Kindle Support if KOReader present)
- 🏷️ Generated books contain metadata, supported by Calibre and other E-Book Software.
- 📖 Plays well with E-Readers. (Send2Kindle, KOReader, ReMarkable, KOBO, Calibre Reader...)
- 💻 Easily Hackable. Extend with ease.
Still not convinced? Take a look at some [sample downloads](./samples/).
## Set Up
1. Clone the repository: `git clone https://github.com/TheOnlyWayUp/WattpadDownloader/ && cd WattpadDownloader`
@@ -24,6 +28,24 @@ Stars ⭐ are appreciated. Thanks!
That's it! You can use your instance at `http://localhost:5042`. API Documentation is available at `http://localhost:5042/docs`.
### Concurrent Requests
The file-based cache struggles with concurrent requests (discussed in TheOnlyWayUp/WattpadDownloader#2 and TheOnlyWayUp/WattpadDownloader#22). If you're downloading a large number of books concurrently, switch to the Redis cache. Assuming you've built the image already:
1. Fill the .env file. Localhost will not work in a docker container unless [`host.docker.internal`](https://docs.docker.com/desktop/features/networking/#i-want-to-connect-from-a-container-to-a-service-on-the-host) or a platform-specific variant is provided.
```
USE_CACHE=true
CACHE_TYPE=redis
REDIS_CONNECTION_URL=redis://username:password@host:port
```
2. Run the container and supply the .env file, `docker run -d -p 5042:80 --env-file .env wp_downloader`
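The API rejects a mismatched cache configuration at startup, so it can help to check the `.env` values before running the container. Below is a minimal, standard-library sketch of the same two rules (no Redis URL with the file cache, a Redis URL required with the Redis cache). `check_cache_env` is a hypothetical helper written for illustration; it is not part of the project.

```python
def check_cache_env(env: dict[str, str]) -> str:
    """Return the effective cache type, or raise ValueError on a mismatched .env.

    Hypothetical helper: it mirrors the rules described above, not the project's own code.
    """
    # An empty or missing CACHE_TYPE falls back to the file cache.
    cache_type = env.get("CACHE_TYPE") or "file"
    redis_url = env.get("REDIS_CONNECTION_URL", "")

    if cache_type not in ("file", "redis"):
        raise ValueError(f"Unknown CACHE_TYPE: {cache_type!r}")
    if cache_type == "file" and redis_url:
        raise ValueError(
            "REDIS_CONNECTION_URL is set but CACHE_TYPE is file; set CACHE_TYPE=redis."
        )
    if cache_type == "redis" and not redis_url:
        raise ValueError("CACHE_TYPE is redis but REDIS_CONNECTION_URL is empty.")
    return cache_type
```

Calling it with the parsed contents of `.env` (for example `{"CACHE_TYPE": "redis", "REDIS_CONNECTION_URL": "redis://host:6379"}`) either returns the cache type or explains what to change.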
Alternatively, if Redis is running on localhost:
2. Modify your `.env` file, replacing `localhost` with `host.docker.internal`. `redis://localhost:6379` should become `redis://host.docker.internal:6379`. Then, start the container, `docker run -d -p 5042:80 --env-file .env --add-host host.docker.internal:host-gateway wp_downloader`
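The host swap described above can also be scripted. This is a small sketch using only `urllib.parse`; `dockerize_redis_url` and the `LOCAL_HOSTS` set are hypothetical names chosen for illustration, and treating `127.0.0.1` as equivalent to `localhost` is an assumption.

```python
from urllib.parse import urlsplit, urlunsplit

# Assumption: these are the hostnames that mean "this machine".
LOCAL_HOSTS = {"localhost", "127.0.0.1"}


def dockerize_redis_url(url: str, docker_host: str = "host.docker.internal") -> str:
    """Point a localhost Redis URL at the Docker host; leave other hosts untouched."""
    parts = urlsplit(url)
    if parts.hostname not in LOCAL_HOSTS:
        return url

    # Rebuild the network location, keeping the port and credentials if present.
    netloc = docker_host
    if parts.port is not None:
        netloc += f":{parts.port}"
    if parts.username is not None:
        userinfo = parts.username
        if parts.password is not None:
            userinfo += f":{parts.password}"
        netloc = f"{userinfo}@{netloc}"
    return urlunsplit(parts._replace(netloc=netloc))
```

The result is the value to place in `REDIS_CONNECTION_URL`; the `--add-host host.docker.internal:host-gateway` flag is still needed when starting the container.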
## Development
- Developers, ensure you have `wkhtmltopdf` available on your PATH.
- Run `wkhtmltopdf` in your terminal. If you see "Reduced Functionality", run [this script](https://raw.githubusercontent.com/JazzCore/python-pdfkit/b7bf798b946fa5655f8e82f0d80dec6b6b13d414/ci/before-script.sh) to install a fully featured build of `wkhtmltopdf`.
---
My thanks to [aerkalov/ebooklib](https://github.com/aerkalov/ebooklib) for a fast and well-documented package.
@@ -31,5 +53,5 @@ My thanks to [aerkalov/ebooklib](https://github.com/aerkalov/ebooklib) for a fas
---
<div align="center">
<p>TheOnlyWayUp © 2023</p>
<p>TheOnlyWayUp © 2024</p>
</div>
BIN
Binary file not shown (image, 264 KiB).
Four further binary files not shown.
+3
@@ -0,0 +1,3 @@
USE_CACHE=true
CACHE_TYPE=file
REDIS_CONNECTION_URL=
+1
@@ -0,0 +1 @@
3.10
+26
@@ -0,0 +1,26 @@
%Image::ExifTool::UserDefined = (
'Image::ExifTool::XMP::xmp' => {
Completed => {
Writable => 'boolean', # Can be a boolean (True/False)
Groups => { 2 => 'Content' },
},
MatureContent => {
Writable => 'boolean', # Can be a boolean (True/False)
Groups => { 2 => 'Content' },
},
},
'Image::ExifTool::IPTC::ApplicationRecord' => {
161 => {
Name => 'Completed',
Format => 'string[0,16]', # Store as a string (e.g., "Yes"/"No")
},
162 => {
Name => 'MatureContent',
Format => 'string[0,16]', # Store as a string (e.g., "Yes"/"No")
},
},
);
1; # End
+28
@@ -0,0 +1,28 @@
[project]
name = "api"
version = "0.1.0"
description = "Wattpad Downloader API"
readme = "../../README.md"
requires-python = ">=3.10"
dependencies = [
"aiohttp>=3.9.1",
"rich>=13.9.4",
"fastapi>=0.115.5",
"ebooklib>=0.18",
"python-dotenv>=1.0.1",
"pydantic-settings>=2.6.1",
"eliot>=1.16.0",
"type-extensions>=0.1.2",
"backoff>=2.2.1",
"aiohttp-client-cache[all]",
"bs4>=0.0.2",
"uvicorn>=0.32.1",
"pyexiftool>=0.5.6",
"weasyprint>=63.0",
]
[tool.ruff.lint]
ignore = ['E402']
[tool.uv.sources]
aiohttp-client-cache = { git = "https://github.com/TheOnlyWayUp/aiohttp-client-cache.git", rev = "keydb-ttl" }
+60 -47
@@ -1,62 +1,75 @@
aiofiles==23.2.1
aiohttp==3.9.1
aiohttp-client-cache==0.10.0
aioboto3==13.2.0
aiobotocore==2.15.2
aiofiles==24.1.0
aiohappyeyeballs==2.4.4
aiohttp==3.11.9
aiohttp-client-cache @ git+https://github.com/TheOnlyWayUp/aiohttp-client-cache.git@1f94f1d751e7320c0ea981d532ff02924782dae6
aioitertools==0.12.0
aiosignal==1.3.1
aiosqlite==0.19.0
annotated-types==0.6.0
anyio==4.2.0
asttokens==2.4.1
aiosqlite==0.20.0
annotated-types==0.7.0
anyio==4.6.2.post1
async-timeout==4.0.3
attrs==23.1.0
backoff==2.2.1
beautifulsoup4==4.12.3
boltons==24.1.0
boto3==1.35.36
botocore==1.35.36
brotli==1.1.0
bs4==0.0.2
cffi==1.17.1
click==8.1.7
comm==0.2.0
debugpy==1.8.0
decorator==5.1.1
EbookLib==0.18
exceptiongroup==1.2.0
executing==2.0.1
fastapi==0.108.0
cssselect2==0.7.0
dnspython==2.7.0
ebooklib==0.18
eliot==1.16.0
exceptiongroup==1.2.2
fastapi==0.115.5
fonttools==4.55.2
frozenlist==1.4.1
h11==0.14.0
idna==3.6
ipykernel==6.28.0
ipython==8.19.0
itsdangerous==2.1.2
jedi==0.19.1
jupyter_client==8.6.0
jupyter_core==5.5.1
lxml==4.9.4
itsdangerous==2.2.0
jmespath==1.0.1
lxml==5.3.0
markdown-it-py==3.0.0
matplotlib-inline==0.1.6
mdurl==0.1.2
motor==3.6.0
multidict==6.0.4
nest-asyncio==1.5.8
packaging==23.2
parso==0.8.3
pexpect==4.9.0
platformdirs==4.1.0
prompt-toolkit==3.0.43
psutil==5.9.7
ptyprocess==0.7.0
pure-eval==0.2.2
pydantic==2.5.3
pydantic_core==2.14.6
Pygments==2.17.2
python-dateutil==2.8.2
pyzmq==25.1.2
rich==13.7.0
orjson==3.10.12
pillow==10.4.0
propcache==0.2.1
pycparser==2.22
pydantic==2.10.2
pydantic-core==2.27.1
pydantic-settings==2.6.1
pydyf==0.11.0
pyexiftool==0.5.6
pygments==2.18.0
pymongo==4.9.2
pyphen==0.15.0
pyrsistent==0.20.0
python-dateutil==2.9.0.post0
python-dotenv==1.0.1
redis==5.2.0
rich==13.9.4
s3transfer==0.10.4
setuptools==75.6.0
six==1.16.0
sniffio==1.3.0
soupsieve==2.5
stack-data==0.6.3
starlette==0.32.0.post1
tornado==6.4
traitlets==5.14.0
typing_extensions==4.9.0
sniffio==1.3.1
soupsieve==2.6
starlette==0.41.3
tinycss2==1.4.0
tinyhtml5==2.0.0
type-extensions==0.1.2
typing-extensions==4.12.2
url-normalize==1.4.3
uvicorn==0.25.0
wcwidth==0.2.12
yarl==1.9.4
urllib3==2.2.3
uvicorn==0.32.1
weasyprint==63.0
webencodings==0.5.1
wrapt==1.17.0
yarl==1.18.3
zope-interface==7.2
zopfli==0.2.3.post1
+708 -148
@@ -1,59 +1,196 @@
import asyncio
from typing import Optional
from ebooklib import epub
import unicodedata
from __future__ import annotations
from typing import List, Optional, Tuple, cast
from typing_extensions import TypedDict
import re
import logging
import tempfile
import unicodedata
from os import environ
from io import BytesIO
from enum import Enum
from base64 import b64encode
import bs4
import backoff
from aiohttp import ClientResponseError, ClientSession
from aiohttp_client_cache.session import CachedSession
from aiohttp_client_cache import FileBackend
from weasyprint import HTML, CSS, default_url_fetcher
from weasyprint.text.fonts import FontConfiguration
from ebooklib import epub
from exiftool import ExifTool
from eliot import to_file, start_action
from eliot.stdlib import EliotHandler
from bs4 import BeautifulSoup
from dotenv import load_dotenv
from pydantic import TypeAdapter, model_validator, field_validator
from pydantic_settings import BaseSettings
from aiohttp import ClientResponseError
from aiohttp_client_cache.session import CachedSession
from aiohttp_client_cache import FileBackend, RedisBackend
load_dotenv(override=True)
handler = EliotHandler()
logging.getLogger("fastapi").setLevel(logging.INFO)
logging.getLogger("fastapi").addHandler(handler)
exiftool_logger = logging.getLogger("exiftool")
exiftool_logger.addHandler(handler)
logger = logging.Logger("wpd")
logger.addHandler(handler)
if environ.get("DEBUG"):
to_file(open("eliot.log", "wb"))
# --- #
class CacheTypes(Enum):
file = "file"
redis = "redis"
class Config(BaseSettings):
USE_CACHE: bool = True
CACHE_TYPE: CacheTypes = CacheTypes.file
REDIS_CONNECTION_URL: str = ""
@field_validator("USE_CACHE", mode="before")
def validate_use_cache(cls, value):
# Return default if value is an empty string
if value == "":
return True # Default value for USE_CACHE
return value
@field_validator("CACHE_TYPE", mode="before")
def validate_cache_type(cls, value):
# Thanks https://stackoverflow.com/a/78157474
if value == "":
return "file"
return value
@model_validator(mode="after")
def prevent_mismatched_redis_url(self):
match self.CACHE_TYPE:
case CacheTypes.file:
if self.REDIS_CONNECTION_URL:
raise ValueError(
"REDIS_CONNECTION_URL provided when File cache selected. To use Redis as a cache, set CACHE_TYPE=redis."
)
case CacheTypes.redis:
if not self.REDIS_CONNECTION_URL:
raise ValueError(
"REDIS_CONNECTION_URL not provided when Redis cache selected. To use File cache, set CACHE_TYPE=file."
)
return self
config = Config()
# --- #
headers = {
"user-agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/103.0.0.0 Safari/537.36"
}
cache = FileBackend(use_temp=True, expire_after=43200) # 12 hours
if config.USE_CACHE:
match config.CACHE_TYPE:
case CacheTypes.file:
cache = FileBackend(use_temp=True, expire_after=43200) # 12 hours
case CacheTypes.redis:
cache = RedisBackend(
cache_name="wpd-aiohttp-cache",
address=config.REDIS_CONNECTION_URL,
expire_after=43200, # 12 hours
)
else:
cache = None
logger.info(f"Using {cache=}")
# --- Utilities --- #
async def wp_get_cookies(username: str, password: str) -> dict:
# source: https://github.com/TheOnlyWayUp/WP-DM-Export/blob/dd4c7c51cb43f2108e0f63fc10a66cd24a740e4e/src/API/src/main.py#L25-L58
"""Retrieves authorization cookies from Wattpad by logging in with user creds.
def smart_trim(text: str, max_length: int = 400) -> str:
"""Truncate a string intelligently at newlines. Coherence and max-length adherence."""
chunks = [t for t in text.split("\n") if t]
Args:
username (str): Username.
password (str): Password.
to_return = ""
for chunk in chunks:
if len(to_return) + len(chunk) < max_length:
to_return += chunk + "<br />"  # accumulate chunks instead of overwriting the result
else:
to_return = to_return.removesuffix("<br />")  # rstrip() would strip a character set, not this suffix
break
Raises:
ValueError: Bad status code.
ValueError: No cookies returned.
return to_return
Returns:
dict: Authorization cookies.
"""
async with ClientSession(headers=headers) as session:
async with session.post(
"https://www.wattpad.com/auth/login?nextUrl=%2F&_data=routes%2Fauth.login",
data={
"username": username.lower(),
"password": password,
}, # the username.lower() is for caching
) as response:
if response.status != 204:
raise ValueError("Not a 204.")
cookies = {
k: v.value
for k, v in response.cookies.items() # Thanks https://stackoverflow.com/a/32281245
}
def generate_clean_part_html(part: Part, content: str) -> bs4.Tag:
"""Rebuild HTML Structure for a Part."""
chapter_title = part["title"]
chapter_id = part["id"]
if not cookies:
raise ValueError("No cookies.")
clean = BeautifulSoup(
f"""
<section id="section_{chapter_id}">
<h1 id="{chapter_id}" class="chapter-title">{chapter_title}</h1>
</section>
""",
"html.parser",
) # html.parser doesn't create <html>/<body> tags automatically
return cookies
html = BeautifulSoup(content, "lxml")
for br in html.find_all("br"):
# Check if no content after br
if not br.next_sibling or br.next_sibling.name in ["br", None]:
br.decompose()
section = cast(bs4.Tag, clean.find("section"))
if not section:
raise Exception()
for child in html.find_all("p"):
current_paragraph = clean.new_tag("p")
# Attempt to carry over paragraph styling
try:
current_paragraph["style"] = child["style"]
except KeyError:  # the source paragraph has no style attribute
current_paragraph["style"] = "text-align: left;"
for p_child in list(child.children):
if not p_child:
continue
if isinstance(p_child, bs4.element.Tag):
if p_child.name == "br":
p_child.decompose()
elif p_child.name == "img":
src = p_child["src"]
img_tag = clean.new_tag("img")
img_tag["src"] = src
section.append(img_tag)
section.append(clean.new_tag("br"))
elif p_child.name in ["b", "i"]:
styled_tag = clean.new_tag(p_child.name)
styled_content = clean.new_string(p_child.text)
styled_tag.append(styled_content)
current_paragraph.append(styled_tag)
else:
# Append any other tags as-is
current_paragraph.append(p_child)
elif isinstance(p_child, bs4.element.NavigableString):
content = clean.new_string(p_child)
current_paragraph.append(content)
if current_paragraph.contents:
section.append(current_paragraph)
if not list(child.children):
# Some p tags only contain brs, once brs are removed, they are empty and can be removed as well.
child.decompose()
return section
def slugify(value, allow_unicode=False) -> str:
@@ -79,148 +216,571 @@ def slugify(value, allow_unicode=False) -> str:
return re.sub(r"[-\s]+", "-", value).strip("-_")
async def fetch_cookies(username: str, password: str) -> dict:
# source: https://github.com/TheOnlyWayUp/WP-DM-Export/blob/dd4c7c51cb43f2108e0f63fc10a66cd24a740e4e/src/API/src/main.py#L25-L58
"""Retrieves authorization cookies from Wattpad by logging in with user creds.
Args:
username (str): Username.
password (str): Password.
Raises:
ValueError: Bad status code.
ValueError: No cookies returned.
Returns:
dict: Authorization cookies.
"""
with start_action(action_type="api_fetch_cookies"):
async with CachedSession(headers=headers, cache=None) as session:
async with session.post(
"https://www.wattpad.com/auth/login?nextUrl=%2F&_data=routes%2Fauth.login",
data={
"username": username.lower(),
"password": password,
}, # the username.lower() is for caching
) as response:
if response.status != 204:
raise ValueError("Not a 204.")
cookies = {
k: v.value
for k, v in response.cookies.items() # Thanks https://stackoverflow.com/a/32281245
}
if not cookies:
raise ValueError("No cookies.")
return cookies
# --- Models --- #
class CopyrightData(TypedDict):
name: str
statement: str
freedoms: str
printing: str
image_url: Optional[str]
class Language(TypedDict):
name: str
class User(TypedDict):
username: str
avatar: str
description: str
class Part(TypedDict):
id: int
title: str
class Story(TypedDict):
id: str
title: str
createDate: str
modifyDate: str
language: Language
user: User
description: str
cover: str
completed: bool
tags: List[str]
mature: bool
url: str
parts: List[Part]
isPaywalled: bool
copyright: int
story_ta = TypeAdapter(Story)
# --- Exceptions --- #
class WattpadError(Exception):
"""Base Exception class for Wattpad related errors."""
class StoryNotFoundError(WattpadError):
"""Display the "This story was not found" error to the user."""
...
class PartNotFoundError(StoryNotFoundError): ...
# --- API Calls --- #
@backoff.on_exception(backoff.expo, ClientResponseError, max_time=15)
async def retrieve_story(story_id: int, cookies: Optional[dict] = None) -> dict:
"""Taking a story_id, return its information from the Wattpad API."""
async with (
CachedSession(headers=headers, cache=cache)
if not cookies
else ClientSession(headers=headers, cookies=cookies)
) as session: # Don't cache requests with Cookies.
async with session.get(
f"https://www.wattpad.com/api/v3/stories/{story_id}?fields=tags,id,title,createDate,modifyDate,language(name),description,completed,mature,url,isPaywalled,user(username),parts(id,title),cover"
) as response:
if not response.ok:
if response.status in [404, 400]:
return {}
response.raise_for_status()
async def fetch_story_from_partId(
part_id: int, cookies: Optional[dict] = None
) -> Tuple[int, Story]:
"""Fetch Story metadata from a Part ID."""
with start_action(action_type="api_fetch_storyFromPartId"):
async with CachedSession(
headers=headers, cache=None if cookies else cache
) as session: # Don't cache requests with Cookies.
async with session.get(
f"https://www.wattpad.com/api/v3/story_parts/{part_id}?fields=groupId,group(tags,id,title,createDate,modifyDate,language(name),description,completed,mature,url,isPaywalled,user(username,avatar,description),parts(id,title),cover,copyright)"
) as response:
body = await response.json()
body = await response.json()
if response.status == 400:
match body.get("error_code"):
case 1020: # "Story part not found"
logger.info(f"{part_id=} not found on Wattpad, returning.")
raise PartNotFoundError()
return body
response.raise_for_status()
return int(body["groupId"]), story_ta.validate_python(body["group"])
@backoff.on_exception(backoff.expo, ClientResponseError, max_time=15)
async def fetch_part_content(part_id: int, cookies: Optional[dict] = None) -> str:
"""Return the HTML Content of a Part."""
async with (
CachedSession(headers=headers, cache=cache)
if not cookies
else ClientSession(headers=headers, cookies=cookies)
) as session: # Don't cache requests with Cookies.
async with session.get(
f"https://www.wattpad.com/apiv2/?m=storytext&id={part_id}"
) as response:
if not response.ok:
if response.status in [404, 400]:
return ""
response.raise_for_status()
async def fetch_story(story_id: int, cookies: Optional[dict] = None) -> Story:
"""Fetch Story metadata from a Story ID."""
with start_action(action_type="api_fetch_story", story_id=story_id):
async with CachedSession(
headers=headers, cookies=cookies, cache=None if cookies else cache
) as session:
async with session.get(
f"https://www.wattpad.com/api/v3/stories/{story_id}?fields=tags,id,title,createDate,modifyDate,language(name),description,completed,mature,url,isPaywalled,user(username,avatar,description),parts(id,title),cover,copyright"
) as response:
body = await response.json()
body = await response.json()
if response.status == 400:
match body.get("error_code"):
case 1017: # "Story not found"
logger.info(f"{story_id=} not found on Wattpad, returning.")
raise StoryNotFoundError()
return body
response.raise_for_status()
return story_ta.validate_python(body)
@backoff.on_exception(backoff.expo, ClientResponseError, max_time=15)
async def fetch_cover(url: str, cookies: Optional[dict] = None) -> bytes:
async def fetch_story_content_zip(
story_id: int, cookies: Optional[dict] = None
) -> BytesIO:
"""BytesIO Stream of an Archive of Part Contents for a Story."""
with start_action(action_type="api_fetch_storyZip", story_id=story_id):
async with CachedSession(
headers=headers,
cookies=cookies,
cache=None if cookies else cache,
) as session:
async with session.get(
f"https://www.wattpad.com/apiv2/?m=storytext&group_id={story_id}&output=zip"
) as response:
response.raise_for_status()
bytes_stream = BytesIO(await response.read())
return bytes_stream
@backoff.on_exception(backoff.expo, ClientResponseError, max_time=15)
async def fetch_image(url: str, should_cache: bool = False) -> bytes:
"""Fetch image bytes."""
async with (
CachedSession(headers=headers, cache=cache)
if not cookies
else ClientSession(headers=headers, cookies=cookies)
) as session: # Don't cache requests with Cookies.
async with session.get(url) as response:
if not response.ok:
if response.status in [404, 400]:
return bytes()
response.raise_for_status()
with start_action(action_type="api_fetch_image", url=url):
async with CachedSession(
headers=headers, cache=cache if should_cache else None
) as session: # Don't cache images.
async with session.get(url) as response:
response.raise_for_status()
body = await response.read()
body = await response.read()
return body
return body
# --- EPUB Generation --- #
# --- Generation --- #
def set_metadata(book, data):
book.add_author(data["user"]["username"])
class EPUBGenerator:
"""EPUB Generation utilities"""
book.add_metadata("DC", "description", data["description"])
book.add_metadata("DC", "created", data["createDate"])
book.add_metadata("DC", "modified", data["modifyDate"])
book.add_metadata("DC", "language", data["language"]["name"])
def __init__(self, data: Story, cover: bytes):
"""Initialize EPUBGenerator. Create epub.EpubBook() and set metadata and cover."""
self.epub = epub.EpubBook()
self.data = data
self.cover = cover
book.add_metadata(
None, "meta", "", {"name": "tags", "content": ", ".join(data["tags"])}
)
book.add_metadata(
None, "meta", "", {"name": "mature", "content": str(int(data["mature"]))}
)
book.add_metadata(
None, "meta", "", {"name": "completed", "content": str(int(data["completed"]))}
)
# set metadata, defined in https://www.dublincore.org/specifications/dublin-core/dcmi-terms/#section-2
self.epub.add_author(data["user"]["username"])
self.epub.add_metadata("DC", "title", data["title"])
self.epub.add_metadata("DC", "description", data["description"])
self.epub.add_metadata("DC", "date", data["createDate"])
self.epub.add_metadata("DC", "modified", data["modifyDate"])
self.epub.add_metadata("DC", "language", data["language"]["name"])
async def set_cover(book, data, cookies: Optional[dict] = None):
book.set_cover("cover.jpg", await fetch_cover(data["cover"], cookies=cookies))
async def add_chapters(
book, data, download_images: bool = False, cookies: Optional[dict] = None
):
chapters = []
for part in data["parts"]:
content = await fetch_part_content(part["id"], cookies=cookies)
title = part["title"]
clean_title = slugify(title)
# Thanks https://eu17.proxysite.com/process.php?d=5VyWYcoQl%2BVF0BYOuOavtvjOloFUZz2BJ%2Fepiusk6Nz7PV%2B9i8rs7cFviGftrBNll%2B0a3qO7UiDkTt4qwCa0fDES&b=1
chapter = epub.EpubHtml(
title=title,
file_name=f"{clean_title}.xhtml",
lang=data["language"]["name"],
self.epub.add_metadata(
None, "meta", "", {"name": "tags", "content": ", ".join(data["tags"])}
)
self.epub.add_metadata(
None, "meta", "", {"name": "mature", "content": str(int(data["mature"]))}
)
self.epub.add_metadata(
None,
"meta",
"",
{"name": "completed", "content": str(int(data["completed"]))},
)
if download_images:
soup = BeautifulSoup(content, "lxml")
async with (
CachedSession(headers=headers, cache=cache)
if not cookies
else ClientSession(headers=headers, cookies=cookies)
) as session: # Don't cache requests with Cookies.
for idx, image in enumerate(soup.find_all("img")):
if not image["src"]:
continue
async with session.get(image["src"]) as response:
img = epub.EpubImage(
media_type="image/jpeg",
content=await response.read(),
file_name=f"static/{clean_title}/{idx}.jpeg",
)
book.add_item(img)
content = content.replace(
str(image), f'<img src="static/{clean_title}/{idx}.jpeg"/>'
)
# Set cover
self.epub.set_cover("cover.jpg", cover)
cover_chapter = epub.EpubHtml(
file_name="titlepage.xhtml", # Standard for cover page
)
cover_chapter.set_content('<img src="cover.jpg">')
self.epub.add_item(cover_chapter)
chapter.set_content(f"<h1>{title}</h1>" + content)
async def add_chapters(
self, contents: List[bs4.Tag], download_images: bool = False
):
"""Add chapters to the Epub, downloading images if necessary. Sets the table of contents and spine."""
chapters: List[epub.EpubHtml] = []
chapters.append(chapter)
for cidx, (part, content) in enumerate(zip(self.data["parts"], contents)):
title = part["title"]
title = re.sub(r'[\x00-\x1F\x7F]', '', title) # Remove control characters
yield title # Yield the chapter's title upon insertion preceded by retrieval.
# Thanks https://eu17.proxysite.com/process.php?d=5VyWYcoQl%2BVF0BYOuOavtvjOloFUZz2BJ%2Fepiusk6Nz7PV%2B9i8rs7cFviGftrBNll%2B0a3qO7UiDkTt4qwCa0fDES&b=1
chapter = epub.EpubHtml(
title=title,
file_name=f"{cidx}_{part['id']}.xhtml", # See issue #30
lang=self.data["language"]["name"],
uid=str(part["id"]).encode(),
)
for chapter in chapters:
book.add_item(chapter)
str_content = content.prettify()
if download_images:
soup = content
book.toc = tuple(chapters)
async with CachedSession(
headers=headers, cache=None
) as session: # Don't cache images.
for idx, image in enumerate(soup.find_all("img")):
if not image["src"]:
continue
# Find all image tags and filter for those with sources
# Thanks https://github.com/aerkalov/ebooklib/blob/master/samples/09_create_image/create.py
book.add_item(epub.EpubNcx())
book.add_item(epub.EpubNav())
async with session.get(image["src"]) as response:
img = epub.EpubImage(
media_type="image/jpeg",
content=await response.read(),
file_name=f"static/{cidx}/{idx}.jpeg",
)
self.epub.add_item(img)
# Fetch image and pack
# create spine
book.spine = ["nav"] + chapters
str_content = str_content.replace(
str(image["src"]), f"static/{cidx}/{idx}.jpeg"
)
chapter.set_content(str_content)
self.epub.add_item(chapter)
chapters.append(chapter)
yield title
self.epub.toc = chapters
# Thanks https://github.com/aerkalov/ebooklib/blob/master/samples/09_create_image/create.py
self.epub.add_item(epub.EpubNcx())
self.epub.add_item(epub.EpubNav())
# create spine
self.epub.spine = ["nav"] + chapters
def dump(self) -> BytesIO:
# Thanks https://stackoverflow.com/a/75398222
buffer = BytesIO()
epub.write_epub(buffer, self.epub)
buffer.seek(0)
return buffer
class PDFGenerator:
"""PDF Generation utilities"""
def __init__(self, data: Story, cover: bytes):
"""Initialize PDFGenerator, create PDF Temporary file."""
self.data = data
self.file = tempfile.NamedTemporaryFile(suffix=".pdf", delete=True)
self.cover = cover
self.content: str = ""
self.copyright = {
1: {
"name": "All Rights Reserved",
"statement": "©️ {published_year} by {username}. All Rights Reserved.",
"freedoms": "No reuse, redistribution, or modification without permission.",
"printing": "Not allowed without explicit permission.",
"image_url": None,
},
2: {
"name": "Public Domain",
"statement": "This work is in the public domain. Originally published in {published_year} by {username}.",
"freedoms": "Free to use for any purpose without permission.",
"printing": "Allowed for personal or commercial purposes.",
"image_url": "http://mirrors.creativecommons.org/presskit/buttons/88x31/png/cc-zero.png",
},
3: {
"name": "Creative Commons Attribution (CC-BY)",
"statement": "©️ {published_year} by {username}. This work is licensed under a Creative Commons Attribution 4.0 International License.",
"freedoms": "Allows reuse, redistribution, and modification with credit to the author.",
"printing": "Allowed with proper credit.",
"image_url": "https://mirrors.creativecommons.org/presskit/buttons/88x31/png/by.png",
},
4: {
"name": "CC Attribution NonCommercial (CC-BY-NC)",
"statement": "©️ {published_year} by {username}. This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.",
"freedoms": "Allows reuse and modification for non-commercial purposes with credit.",
"printing": "Allowed for non-commercial purposes with proper credit.",
"image_url": "http://mirrors.creativecommons.org/presskit/buttons/88x31/png/by-nc.png",
},
5: {
"name": "CC Attribution NonCommercial NoDerivs (CC-BY-NC-ND)",
"statement": "©️ {published_year} by {username}. This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivs 4.0 International License.",
"freedoms": "Allows sharing in original form for non-commercial purposes with credit; no modifications allowed.",
"printing": "Allowed for non-commercial purposes in original form with proper credit.",
"image_url": "http://mirrors.creativecommons.org/presskit/buttons/88x31/png/by-nc-nd.png",
},
6: {
"name": "CC Attribution NonCommercial ShareAlike (CC-BY-NC-SA)",
"statement": "©️ {published_year} by {username}. This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.",
"freedoms": "Allows reuse and modification for non-commercial purposes under the same license, with credit.",
"printing": "Allowed for non-commercial purposes with proper credit under the same license.",
"image_url": "http://mirrors.creativecommons.org/presskit/buttons/88x31/png/by-nc-sa.png",
},
7: {
"name": "CC Attribution ShareAlike (CC-BY-SA)",
"statement": "©️ {published_year} by {username}. This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.",
"freedoms": "Allows reuse and modification for any purpose under the same license, with credit.",
"printing": "Allowed with proper credit under the same license.",
"image_url": "https://mirrors.creativecommons.org/presskit/buttons/88x31/png/by-sa.png",
},
8: {
"name": "CC Attribution NoDerivs (CC-BY-ND)",
"statement": "©️ {published_year} by {username}. This work is licensed under a Creative Commons Attribution-NoDerivs 4.0 International License.",
"freedoms": "Allows sharing in original form for any purpose with credit; no modifications allowed.",
"printing": "Allowed in original form with proper credit.",
"image_url": "https://mirrors.creativecommons.org/presskit/buttons/88x31/png/by-nd.png",
},
}
with open("./pdf/stylesheet.css") as reader:
self.stylesheet = reader.read()
with open("./pdf/book.html") as reader:
self.template = reader.read()
async def generate_cover_and_copyright_html(
self,
) -> str:
"""Generate Cover and Copyright file, fetch copyright image (cached), use self.cover for cover."""
copyright_data = self.copyright[self.data["copyright"]]
template = self.template
about_copyright = (
template.replace(
"{statement}",
copyright_data["statement"].format(
username=self.data["user"]["username"],
published_year=self.data["createDate"].split("-", 2)[0],
),
)
.replace("{author}", self.data["user"]["username"])
.replace("{freedoms}", copyright_data["freedoms"])
.replace(
"{printing}",
copyright_data["printing"],
)
.replace("{book_id}", self.data["id"])
.replace("{book_title}", self.data["title"])
)
copyright_image = (
await fetch_image(copyright_data["image_url"], should_cache=True)
if copyright_data["image_url"]
else None
)
image_block = (
"""<img src="{image_url}"
alt="{name}"
width="88"
height="31"
id="copyright-license-image">""".format(
image_url=f"data:image/jpg;base64,{b64encode(copyright_image).decode()}",
name=copyright_data["name"],
)
if copyright_image
else ""
)
about_copyright = (
about_copyright.replace(
"{copyright_image}",
image_block,
)
if image_block
else about_copyright.replace("{copyright_image}", "")
)
about_copyright = about_copyright.replace(
"{cover}", f"data:image/jpg;base64,{b64encode(self.cover).decode()}"
)
self.template = about_copyright
return about_copyright
async def generate_about_author_chapter(self) -> str:
"""Generate About the Author file, fetch avatar."""
author_avatar = (
await fetch_image(
self.data["user"]["avatar"].replace("128", "512")
) # Increase image resolution
if self.data["user"]["avatar"]
else None
)
about_author = self.template.replace(
"{username}", self.data["user"]["username"]
).replace("{description}", smart_trim(self.data["user"]["description"]))
about_author = (
about_author.replace(
"{avatar}",
f"""
<img src="data:image/jpg;base64,{b64encode(author_avatar).decode()}" alt="Author's profile picture" id="author-profile-picture">""",
)
if author_avatar
else about_author.replace("{avatar}", "")
)
self.template = about_author
return about_author
def generate_toc(self):
ids = [part["id"] for part in self.data["parts"]]
clean = BeautifulSoup(
"""
<section id="contents" class="toc">
<h1>Table of Contents</h1>
<ul></ul>
</section>
""",
"html.parser",
) # html.parser doesn't create <html>/<body> tags automatically
ul = cast(bs4.Tag, clean.find("ul"))
for part_id in ids:
li = clean.new_tag("li")
a = clean.new_tag("a")
a["href"] = f"#{part_id}"
li.append(a)
ul.append(li)
insert_point = cast(bs4.Tag, self.tree.find("div", {"id": "book"}))
insert_point.append(clean)
return str(clean)
async def add_chapters(
self, contents: List[bs4.Tag], download_images: bool = False
):
"""Add chapters to the PDF, downloading images if necessary. Also add Cover, Copyright, and About the Author pages."""
# # Cover and Copyright Page
await self.generate_cover_and_copyright_html()
await self.generate_about_author_chapter()
self.tree = BeautifulSoup(self.template, "lxml")
self.generate_toc()
for part, content in zip(self.data["parts"], contents):
insert_point = cast(bs4.Tag, self.tree.find("div", {"id": "book"}))
insert_point.append(content)
yield part["title"]
# # About the Author page
# about_author_html = await self.generate_about_author_chapter()
# chapters.insert(0, cover_and_copyright_html)
# chapters.append(about_author_html)
with start_action(
action_type="generate_pdf",
output_filename=self.file.name,
title=self.data["title"],
):
# PDF Generation with WeasyPrint, written to self.file
# At this stage, self.tree holds the whole book as one HTML document: cover, copyright page, ToC (built by generate_toc), chapters, and the About the Author page.
font_config = FontConfiguration()
stylesheet_obj = CSS(string=self.stylesheet, font_config=font_config)
html_obj = HTML(string=str(self.tree))
html_obj.write_pdf(
self.file.name, stylesheets=[stylesheet_obj], font_config=font_config
)
with start_action(action_type="add_metadata") as action:
# Metadata generation with Exiftool
clean_description = (
self.data["description"].strip().replace("\n", "$/")
) # exiftool does not parse "\n" correctly; it accepts $/ as a newline instead. `&#xa;` is another option.
action.log(f"clean_description: {clean_description}")
metadata = {
"Author": self.data["user"]["username"],
"Title": self.data["title"],
"Subject": clean_description,
"CreationDate": self.data["createDate"],
"ModDate": self.data["modifyDate"],
"Keywords": ",".join(self.data["tags"]),
"Language": self.data["language"]["name"],
"Completed": self.data["completed"],
"MatureContent": self.data["mature"],
"Producer": "Dhanush Rambhatla (TheOnlyWayUp - https://rambhat.la) and WattpadDownloader",
} # As per https://exiftool.org/TagNames/PDF.html
action.log(f"options: {metadata}")
with ExifTool(
config_file="../exiftool.config", logger=exiftool_logger
) as et:
# Custom configuration adds Completed and MatureContent tags.
# exiftool logger logs executed command
et.execute(
*(
[f"-{key}={value}" for key, value in metadata.items()]
+ [
"-overwrite_original",
self.file.file.name,
]
)
)
def dump(self) -> BytesIO:
self.file.seek(0)
buffer = BytesIO(self.file.read())
self.file.close()
return buffer
# ------ #
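The chapter-title handling in `EPUBGenerator.add_chapters` can be looked at in isolation. The sketch below is not part of this commit: `clean_title` and `chapter_file_name` are hypothetical helper names, and it only assumes the same regex and the same `{index}_{part id}.xhtml` naming that appear in the diff above.

```python
import re

# Same character class as in the diff: ASCII control characters plus DEL.
CONTROL_CHARS = re.compile(r"[\x00-\x1F\x7F]")


def clean_title(title: str) -> str:
    """Return the title with ASCII control characters removed (hypothetical helper)."""
    return CONTROL_CHARS.sub("", title)


def chapter_file_name(index: int, part_id: int) -> str:
    """Build a chapter file name from its position and part ID, independent of the title text."""
    return f"{index}_{part_id}.xhtml"


title = clean_title("Chapter\x00 One\n")
file_name = chapter_file_name(0, 123)
```

Because the file name no longer depends on the title, unusual characters in a title only affect what is displayed, not the paths inside the EPUB.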
+191 -67
@@ -1,92 +1,216 @@
"""WattpadDownloader API Server."""
from typing import Optional
import asyncio
from pathlib import Path
from fastapi import FastAPI, HTTPException
from fastapi.responses import FileResponse, HTMLResponse, StreamingResponse
from ebooklib import epub
from create_book import (
retrieve_story,
set_cover,
set_metadata,
add_chapters,
slugify,
wp_get_cookies,
from enum import Enum
from zipfile import ZipFile
from eliot import start_action
from aiohttp import ClientResponseError
from fastapi import FastAPI, Request
from fastapi.responses import (
FileResponse,
HTMLResponse,
RedirectResponse,
StreamingResponse,
)
import tempfile
from io import BytesIO
from fastapi.staticfiles import StaticFiles
from create_book import (
EPUBGenerator,
PDFGenerator,
fetch_story,
fetch_story_from_partId,
fetch_story_content_zip,
fetch_image,
fetch_cookies,
WattpadError,
StoryNotFoundError,
generate_clean_part_html,
slugify,
logger,
)
app = FastAPI()
BUILD_PATH = Path(__file__).parent / "build"
headers = {
"user-agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/103.0.0.0 Safari/537.36"
}
class RequestCancelledMiddleware:
# Thanks https://github.com/fastapi/fastapi/discussions/11360#discussion-6427734
def __init__(self, app):
self.app = app
async def __call__(self, scope, receive, send):
if scope["type"] != "http":
await self.app(scope, receive, send)
return
# Let's make a shared queue for the request messages
queue = asyncio.Queue()
async def message_poller(sentinel, handler_task):
nonlocal queue
while True:
message = await receive()
if message["type"] == "http.disconnect":
handler_task.cancel()
return sentinel # Break the loop
# Puts the message in the queue
await queue.put(message)
sentinel = object()
handler_task = asyncio.create_task(self.app(scope, queue.get, send))
asyncio.create_task(message_poller(sentinel, handler_task))
try:
return await handler_task
except asyncio.CancelledError:
logger.info("Cancelling task as connection closed")
app.add_middleware(RequestCancelledMiddleware)
class DownloadFormat(Enum):
pdf = "pdf"
epub = "epub"
class DownloadMode(Enum):
story = "story"
part = "part"
@app.get("/")
def home():
return FileResponse(BUILD_PATH / "index.html")
@app.get("/download/{story_id}")
async def download_book(
story_id: int,
@app.exception_handler(ClientResponseError)
def download_error_handler(request: Request, exception: ClientResponseError):
match exception.status:
case 400 | 404:
return HTMLResponse(
status_code=404,
content='This story does not exist, or has been deleted. Support is available on the <a href="https://discord.gg/P9RHC4KCwd" target="_blank">Discord</a>',
)
case 429:
# Rate-limit by Wattpad
return HTMLResponse(
status_code=429,
content='The website is overloaded. Please try again in a few minutes. Support is available on the <a href="https://discord.gg/P9RHC4KCwd" target="_blank">Discord</a>',
)
case _:
# Unhandled error
return HTMLResponse(
status_code=500,
content='Something went wrong. Yell at me on the <a href="https://discord.gg/P9RHC4KCwd" target="_blank">Discord</a>',
)
@app.exception_handler(WattpadError)
def download_wp_error_handler(request: Request, exception: WattpadError):
if isinstance(exception, StoryNotFoundError):
return HTMLResponse(
status_code=404,
content='This story does not exist, or has been deleted. Support is available on the <a href="https://discord.gg/P9RHC4KCwd" target="_blank">Discord</a>',
)
@app.get("/download/{download_id}")
async def handle_download(
download_id: int,
download_images: bool = False,
mode: DownloadMode = DownloadMode.story,
format: DownloadFormat = DownloadFormat.epub,
username: Optional[str] = None,
password: Optional[str] = None,
):
if username and not password or password and not username:
return HTMLResponse(
status_code=422,
content='Include both the username _and_ password, or neither. Support is available on the <a href="https://discord.gg/P9RHC4KCwd" target="_blank">Discord</a>',
)
if username and password:
try:
cookies = await wp_get_cookies(username=username, password=password)
except ValueError:
return HTMLResponse(
status_code=403,
content='Incorrect Username and/or Password. Support is available on the <a href="https://discord.gg/P9RHC4KCwd" target="_blank">Discord</a>',
)
else:
cookies = None
data = await retrieve_story(story_id, cookies=cookies)
book = epub.EpubBook()
try:
set_metadata(book, data)
except KeyError:
return HTMLResponse(
status_code=404,
content='Story not found. Check the ID - Support is available on the <a href="https://discord.gg/P9RHC4KCwd" target="_blank">Discord</a>',
)
await set_cover(book, data, cookies=cookies)
# print("Metadata Downloaded")
# Chapters are downloaded
async for title in add_chapters(
book, data, download_images=download_images, cookies=cookies
with start_action(
action_type="download",
download_id=download_id,
download_images=download_images,
format=format,
mode=mode,
):
# print(f"Part ({title}) downloaded")
...
if username and not password or password and not username:
logger.error(
"Username with no Password or Password with no Username provided."
)
return HTMLResponse(
status_code=422,
content='Include both the username <u>and</u> password, or neither. Support is available on the <a href="https://discord.gg/P9RHC4KCwd" target="_blank">Discord</a>',
)
# Book is compiled
temp_file = tempfile.NamedTemporaryFile(
suffix=".epub", delete=True
) # Thanks https://stackoverflow.com/a/75398222
if username and password:
# username and password are URL-Encoded by the frontend. FastAPI automatically decodes them.
try:
cookies = await fetch_cookies(username=username, password=password)
except ValueError:
logger.error("Invalid username or password.")
return HTMLResponse(
status_code=403,
content='Incorrect Username and/or Password. Support is available on the <a href="https://discord.gg/P9RHC4KCwd" target="_blank">Discord</a>',
)
else:
cookies = None
# create epub file
epub.write_epub(temp_file, book, {})
match mode:
case DownloadMode.story:
story_id = download_id
metadata = await fetch_story(story_id, cookies)
case DownloadMode.part:
story_id, metadata = await fetch_story_from_partId(download_id, cookies)
temp_file.file.seek(0)
book_data = temp_file.file.read()
cover_data = await fetch_image(
metadata["cover"].replace("-256-", "-512-")
) # Increase resolution
return StreamingResponse(
BytesIO(book_data),
media_type="application/epub+zip",
headers={
"Content-Disposition": f'attachment; filename="{slugify(data["title"])}_{story_id}_{"images" if download_images else ""}.epub"' # Thanks https://stackoverflow.com/a/72729058
},
)
match format:
case DownloadFormat.epub:
book = EPUBGenerator(metadata, cover_data)
media_type = "application/epub+zip"
case DownloadFormat.pdf:
book = PDFGenerator(metadata, cover_data)
media_type = "application/pdf"
logger.info(f"Retrieved story metadata and cover ({story_id=})")
story_zip = await fetch_story_content_zip(story_id, cookies)
archive = ZipFile(story_zip, "r")
part_contents = [
generate_clean_part_html(
part, archive.read(str(part["id"])).decode("utf-8")
)
for part in metadata["parts"]
]
async for title in book.add_chapters(
part_contents, download_images=download_images
):
...
book_buffer = book.dump()
return StreamingResponse(
book_buffer,
media_type=media_type,
headers={
"Content-Disposition": f'attachment; filename="{slugify(metadata["title"])}_{story_id}{"_images" if download_images else ""}.{format.value}"' # Thanks https://stackoverflow.com/a/72729058
},
)
@app.get("/donate")
def donate():
"""Redirect to donation URL."""
return RedirectResponse("https://buymeacoffee.com/theonlywayup")
app.mount("/", StaticFiles(directory=BUILD_PATH), "static")
@@ -95,4 +219,4 @@ app.mount("/", StaticFiles(directory=BUILD_PATH), "static")
if __name__ == "__main__":
import uvicorn
uvicorn.run(app, host="0.0.0.0", port=80)
uvicorn.run("main:app", host="0.0.0.0", port=80, workers=16)
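The handler above reads every part out of a single in-memory zip instead of requesting parts one by one. A minimal sketch of that lookup follows, assuming (as `archive.read(str(part["id"]))` in the diff suggests) that each archive entry is named after its part ID. `read_parts` is a hypothetical helper, and the archive built here is only a stand-in for the Wattpad response.

```python
from io import BytesIO
from zipfile import ZipFile


def read_parts(archive_bytes: BytesIO, part_ids: list[int]) -> list[str]:
    """Return decoded part HTML in the order given by part_ids (hypothetical helper)."""
    with ZipFile(archive_bytes, "r") as archive:
        return [archive.read(str(part_id)).decode("utf-8") for part_id in part_ids]


# Build a small in-memory archive standing in for the downloaded zip.
buffer = BytesIO()
with ZipFile(buffer, "w") as archive:
    archive.writestr("101", "<p>First part</p>")
    archive.writestr("102", "<p>Second part</p>")
buffer.seek(0)

# The order of the result follows part_ids, not the order inside the archive.
parts = read_parts(buffer, [102, 101])
```

Ordering by the metadata's part list rather than by archive order keeps chapters in story order even if the zip lists them differently.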
+54
@@ -0,0 +1,54 @@
<!DOCTYPE html>
<html lang="{langcode}">
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>{book_title}</title>
<section class="fullpage">
<img src="{cover}" alt="Cover">
</section>
<div id="copyright-container">
<h1 id="copyright-notice">Copyright Notice</h1>
<h2 id="copyright-title">{book_title}</h2>
<p id="copyright-author">By {author}</p>
<div id="copyright-separator"></div>
<p id="copyright-ex-libris">Ex Libris Sapientiae</p>
<div id="copyright-separator"></div>
{copyright_image}
<p id="copyright-copyright">{statement}</p>
<p id="copyright-rights">{freedoms}</p>
<p id="copyright-printing">Printing: {printing}</p>
<p id="copyright-printing">ID: {book_id}. <a href="https://wattpad.com/story/{book_id}" target="_blank" id="copyright-link">View this Book Online</a></p>
</div>
<div id="book">
</div>
<h1>About the Author</h1>
<div id="author-container">
<div id="author-about">
{avatar}
<h2 id="author-name"><a href="https://wattpad.com/user/{username}" id="author-link">{username}</a></h2>
<hr id="author-divider">
<p id="author-bio">
{description}
</p>
</div>
</div>
</html>
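The `{placeholder}` markers in this template are filled with chained `str.replace` calls in `PDFGenerator`. The sketch below shows the same idea as a loop; `fill_template` is a hypothetical helper, not code from this commit, and the sample values are made up for illustration.

```python
def fill_template(template: str, values: dict[str, str]) -> str:
    """Replace each {key} placeholder with its value; unknown placeholders are left untouched."""
    for key, value in values.items():
        template = template.replace("{" + key + "}", value)
    return template


snippet = (
    '<h2 id="copyright-title">{book_title}</h2>'
    '<p id="copyright-author">By {author}</p>{avatar}'
)
filled = fill_template(
    snippet, {"book_title": "Example Story", "author": "example_user"}
)
```

Leaving unknown placeholders in place is what allows the template to be filled in stages, for example the copyright page first and the About the Author section afterwards. Note that this sketch inserts values as-is and does no HTML escaping.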
+94
@@ -0,0 +1,94 @@
Copyright (c) 2010, ParaType Ltd. (http://www.paratype.com/public),
with Reserved Font Names "PT Sans", "PT Serif" and "ParaType".
This Font Software is licensed under the SIL Open Font License, Version 1.1.
This license is copied below, and is also available with a FAQ at:
https://openfontlicense.org
-----------------------------------------------------------
SIL OPEN FONT LICENSE Version 1.1 - 26 February 2007
-----------------------------------------------------------
PREAMBLE
The goals of the Open Font License (OFL) are to stimulate worldwide
development of collaborative font projects, to support the font creation
efforts of academic and linguistic communities, and to provide a free and
open framework in which fonts may be shared and improved in partnership
with others.
The OFL allows the licensed fonts to be used, studied, modified and
redistributed freely as long as they are not sold by themselves. The
fonts, including any derivative works, can be bundled, embedded,
redistributed and/or sold with any software provided that any reserved
names are not used by derivative works. The fonts and derivatives,
however, cannot be released under any other type of license. The
requirement for fonts to remain under this license does not apply
to any document created using the fonts or their derivatives.
DEFINITIONS
"Font Software" refers to the set of files released by the Copyright
Holder(s) under this license and clearly marked as such. This may
include source files, build scripts and documentation.
"Reserved Font Name" refers to any names specified as such after the
copyright statement(s).
"Original Version" refers to the collection of Font Software components as
distributed by the Copyright Holder(s).
"Modified Version" refers to any derivative made by adding to, deleting,
or substituting -- in part or in whole -- any of the components of the
Original Version, by changing formats or by porting the Font Software to a
new environment.
"Author" refers to any designer, engineer, programmer, technical
writer or other person who contributed to the Font Software.
PERMISSION & CONDITIONS
Permission is hereby granted, free of charge, to any person obtaining
a copy of the Font Software, to use, study, copy, merge, embed, modify,
redistribute, and sell modified and unmodified copies of the Font
Software, subject to the following conditions:
1) Neither the Font Software nor any of its individual components,
in Original or Modified Versions, may be sold by itself.
2) Original or Modified Versions of the Font Software may be bundled,
redistributed and/or sold with any software, provided that each copy
contains the above copyright notice and this license. These can be
included either as stand-alone text files, human-readable headers or
in the appropriate machine-readable metadata fields within text or
binary files as long as those fields can be easily viewed by the user.
3) No Modified Version of the Font Software may use the Reserved Font
Name(s) unless explicit written permission is granted by the corresponding
Copyright Holder. This restriction only applies to the primary font name as
presented to the users.
4) The name(s) of the Copyright Holder(s) or the Author(s) of the Font
Software shall not be used to promote, endorse or advertise any
Modified Version, except to acknowledge the contribution(s) of the
Copyright Holder(s) and the Author(s) or with their explicit written
permission.
5) The Font Software, modified or unmodified, in part or in whole,
must be distributed entirely under this license, and must not be
distributed under any other license. The requirement for fonts to
remain under this license does not apply to any document created
using the Font Software.
TERMINATION
This license becomes null and void if any of the above conditions are
not met.
DISCLAIMER
THE FONT SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTIES OF
MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT
OF COPYRIGHT, PATENT, TRADEMARK, OR OTHER RIGHT. IN NO EVENT SHALL THE
COPYRIGHT HOLDER BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY,
INCLUDING ANY GENERAL, SPECIAL, INDIRECT, INCIDENTAL, OR CONSEQUENTIAL
DAMAGES, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
FROM, OUT OF THE USE OR INABILITY TO USE THE FONT SOFTWARE OR FROM
OTHER DEALINGS IN THE FONT SOFTWARE.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
+420
@@ -0,0 +1,420 @@
@font-face {
font-family: 'PT Serif';
src: url('/tmp/fonts/PTSerif-Regular.ttf') format('truetype');
font-weight: 400;
font-style: normal;
}
@font-face {
font-family: 'PT Serif';
src: url('/tmp/fonts/PTSerif-Bold.ttf') format('truetype');
font-weight: 700;
font-style: normal;
}
@font-face {
font-family: 'PT Serif';
src: url('/tmp/fonts/PTSerif-Italic.ttf') format('truetype');
font-weight: 400;
font-style: italic;
}
@font-face {
font-family: 'PT Serif';
src: url('/tmp/fonts/PTSerif-BoldItalic.ttf') format('truetype');
font-weight: 700;
font-style: italic;
}
.pt-serif-regular {
font-family: "PT Serif", serif;
font-weight: 400;
font-style: normal;
}
.pt-serif-bold {
font-family: "PT Serif", serif;
font-weight: 700;
font-style: normal;
}
.pt-serif-regular-italic {
font-family: "PT Serif", serif;
font-weight: 400;
font-style: italic;
}
.pt-serif-bold-italic {
font-family: "PT Serif", serif;
font-weight: 700;
font-style: italic;
}
@page {
margin: 2cm 2cm 3cm 2cm;
size: 148mm 210mm;
}
@page :left {
@bottom-left {
content: counter(page);
position: absolute;
z-index: -1;
}
@bottom-right {
content: string(heading);
position: absolute;
z-index: -1;
}
}
@page :right {
@bottom-left {
content: string(heading);
position: absolute;
z-index: -1;
}
@bottom-right {
content: counter(page);
position: absolute;
z-index: -1;
}
}
@page full {
@bottom-right {
content: none;
}
@bottom-left {
content: none;
}
background: black;
margin: 0;
}
@page :blank {
@bottom-right {
content: none;
}
@bottom-left {
content: none;
}
}
@page clean {
@bottom-right {
content: none;
}
@bottom-left {
content: none;
}
}
html {
counter-reset: h2-counter;
font-size: 10pt;
}
body {
margin: 0;
}
p {
line-height: 2;
text-align: justify;
}
img {
display: block;
margin: 2em auto;
max-width: 70%;
}
#contents {
border-bottom: 1px dashed rgb(100,000,100);
h2 {
font-family: "PT Serif", serif;
font-weight: 400;
font-style: normal;
}
padding-top: 5px;
}
.chapter-title {
counter-increment: h2-counter;
display: flex;
flex-direction: column;
font-size: 3em;
height: 6cm;
justify-content: flex-end;
margin: 0;
string-set: heading content();
text-align: center;
font-family: "PT Serif", serif;
font-weight: 700;
font-style: normal !important;
font-size: 36px !important; /* Uniform size */
margin-bottom: 20px; /* Space below the heading */
border-bottom: 2px solid rgb(100, 100, 100); /* Gray line */
padding-bottom: 10px; /* Space between text and line */
}
p {
font-size: 16px !important; /* Standardize paragraph size */
line-height: 1.6 !important; /* Improve readability */
margin: 10px 0 !important; /* Space between paragraphs */
}
.chapter-title::before {
content: "Chapter " counter(h2-counter) " ";
display: block;
font-size: 1.2rem;
font-weight: normal;
line-height: 1;
}
section {
break-after: right;
}
#contents {
page: clean;
}
#contents p {
font-size: 2em;
}
#contents ul {
display: block;
margin: 1em 0;
padding: 0;
}
#contents li {
display: block;
}
#contents a {
color: inherit;
text-decoration: none;
display: flex;
justify-content: space-between;
}
#contents a::before {
content: target-counter(attr(href), h2-counter) '. ' target-text(attr(href));
width: 100%;
}
#contents a::after {
content: target-counter(attr(href), page);
text-align: end;
}
.outro {
border-radius: 50% 50% 0 0 / 15mm 15mm 0 0;
display: block;
height: 90mm;
left: -30mm;
max-width: none;
object-fit: cover;
position: absolute;
top: 120mm;
width: 168mm;
z-index: -1;
}
.fullpage {
page: full;
}
.fullpage img {
bottom: 0;
height: 210mm;
left: 0;
margin: 0;
max-width: none;
object-fit: cover;
position: absolute;
width: 148mm;
z-index: 1;
}
.fullpage:last-child {
break-before: left;
}
a {
font-size: 0.9rem;
color: #3182ce;
text-decoration: none;
display: inline-block;
margin-top: 1rem;
/* Cross-browser transition */
-webkit-transition: all 0.2s ease;
-moz-transition: all 0.2s ease;
-o-transition: all 0.2s ease;
transition: all 0.2s ease;
}
a:hover {
text-decoration: underline;
color: #2c5282;
}
/* Container centering for older browsers */
#author-container {
position: absolute;
top: 50%;
left: 50%;
-webkit-transform: translate(-50%, -50%); /* Old WebKit */
transform: translate(-50%, -50%);
width: 90%;
max-width: 400px;
text-align: center;
}
#author-about {
padding: 20px;
/* Fallback for older browsers */
display: block;
margin: 0 auto;
}
#author-profile-picture {
width: 200px;
height: 200px;
-webkit-border-radius: 100px; /* Old WebKit */
border-radius: 100px;
margin: 0 auto 20px auto;
display: block;
}
#author-name {
font-size: 24px;
font-weight: bold;
margin: 0 0 10px 0;
padding: 0;
}
#author-link {
color: #1a202c;
text-decoration: none;
}
#author-link:hover {
color: #4a5568;
text-decoration: underline;
}
#author-divider {
width: 60px;
height: 2px;
background-color: #d1d5db;
border: none;
margin: 0 auto 20px auto;
}
#author-bio {
color: #4b5563;
line-height: 1.6;
margin: 0;
padding: 0;
}
#copyright-container {
max-width: 600px;
margin: 60px auto;
text-align: center !important;
font-family: Georgia, serif !important;
line-height: 1.6 !important;
color: #333 !important;
}
#copyright-notice {
font-size: 24px;
margin-bottom: 4px;
border-bottom: 1px solid #333;
padding-bottom: 8px;
color: #1a1a1a;
}
#copyright-title {
font-size: 28px;
margin: 24px 0 4px 0;
color: #1a1a1a;
}
#copyright-author {
font-size: 18px;
margin: 0 0 32px 0;
color: #444;
text-align: center;
}
#copyright-license-image {
margin: 20px 0;
width: 88px;
height: 31px;
display: block;
margin-left: auto;
margin-right: auto;
}
#copyright-copyright {
font-size: 16px;
margin: 16px 0;
text-align: center;
}
#copyright-rights {
font-size: 14px;
color: #666;
margin: 8px 0;
text-align: center;
}
#copyright-printing {
font-size: 14px;
color: #666;
margin: 8px 0;
text-align: center;
}
#copyright-separator {
width: 100%;
max-width: 400px;
height: 1px;
background: #e2e8f0;
position: relative;
margin: 2rem 1rem;
/* Gradient fallback */
background: -webkit-gradient(linear, left top, right top, from(transparent), color-stop(#718096), to(transparent));
background: -webkit-linear-gradient(left, transparent, #718096, transparent);
background: -moz-linear-gradient(left, transparent, #718096, transparent);
background: -o-linear-gradient(left, transparent, #718096, transparent);
background: linear-gradient(to right, transparent, #718096, transparent);
}
#copyright-ex-libris {
font-size: 1.5rem;
font-style: italic;
color: #4a5568;
margin: 2rem 0;
text-align: center;
}
#copyright-link {
font-size: 14px;
}
+120
@@ -0,0 +1,120 @@
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:outline="http://wkhtmltopdf.org/outline"
xmlns="http://www.w3.org/1999/xhtml">
<xsl:output doctype-public="-//W3C//DTD XHTML 1.0 Strict//EN"
doctype-system="http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"
indent="yes" />
<xsl:template match="outline:outline">
<html>
<head>
<style>
@font-face {
font-family: 'PT Serif';
src: url('./fonts/PTSerif-Regular.ttf') format('truetype');
font-weight: 400;
font-style: normal;
}
@font-face {
font-family: 'PT Serif';
src: url('./fonts/PTSerif-Bold.ttf') format('truetype');
font-weight: 700;
font-style: normal;
}
@font-face {
font-family: 'PT Serif';
src: url('./fonts/PTSerif-Italic.ttf') format('truetype');
font-weight: 400;
font-style: italic;
}
@font-face {
font-family: 'PT Serif';
src: url('./fonts/PTSerif-BoldItalic.ttf') format('truetype');
font-weight: 700;
font-style: italic;
}
.pt-serif-regular {
font-family: "PT Serif", serif;
font-weight: 400;
font-style: normal;
}
.pt-serif-bold {
font-family: "PT Serif", serif;
font-weight: 700;
font-style: normal;
}
.pt-serif-regular-italic {
font-family: "PT Serif", serif;
font-weight: 400;
font-style: italic;
}
.pt-serif-bold-italic {
font-family: "PT Serif", serif;
font-weight: 700;
font-style: italic;
}
h1 {
text-align: center;
font-family: "PT Serif", serif !important;
font-weight: 700 !important;
font-style: normal !important;
font-size: 36px !important; /* Uniform size */
margin-bottom: 20px; /* Space below the heading */
border-bottom: 4px solid black; /* Black line */
padding-bottom: 10px; /* Space between text and line */
}
div {border-bottom: 1px dashed rgb(100,000,100);
padding-top: 5px;}
span {float: right;}
li {list-style: none;}
ul {
font-size: 22px;
font-family: arial;
}
ul ul {font-size: 80%; }
ul {padding-left: 0em;}
ul ul {padding-left: 1em;}
a {text-decoration:none; color: black;}
</style>
</head>
<body>
<h1>Table of Contents</h1>
<ul><xsl:apply-templates select="outline:item/outline:item"/></ul>
</body>
</html>
</xsl:template>
<xsl:template match="outline:item">
<li>
<xsl:if test="@title!=''">
<div>
<a class="pt-serif-regular">
<xsl:if test="@link">
<xsl:attribute name="href"><xsl:value-of select="@link"/></xsl:attribute>
</xsl:if>
<xsl:if test="@backLink">
<xsl:attribute name="name"><xsl:value-of select="@backLink"/></xsl:attribute>
</xsl:if>
<xsl:value-of select="@title" />
</a>
<span> <xsl:value-of select="@page" /> </span>
</div>
</xsl:if>
<ul>
<xsl:comment>added to prevent self-closing tags in QtXmlPatterns</xsl:comment>
<xsl:apply-templates select="outline:item"/>
</ul>
</li>
</xsl:template>
</xsl:stylesheet>
+1707
File diff suppressed because it is too large
+3 -5
@@ -7,26 +7,24 @@
<title>Wattpad Downloader</title>
<meta name="title" content="Wattpad Downloader" />
<meta name="description" content="Read your way, download Wattpad Books as EPUBs in seconds. Have an Ad-Free experience with Unlimited Offline Reading. Try it now!" />
<meta name="description" content="Read your way, download Wattpad Books as PDFs or EPUBs in seconds. Have an Ad-Free experience with Unlimited Offline Reading. Try it now!" />
<!-- Open Graph / Facebook -->
<meta property="og:type" content="website" />
<meta property="og:url" content="https://wpd.rambhat.la/" />
<meta property="og:title" content="Wattpad Downloader" />
<meta property="og:description" content="Read your way, download Wattpad Books as EPUBs in seconds. Have an Ad-Free experience with Unlimited Offline Reading. Try it now!" />
<meta property="og:description" content="Read your way, download Wattpad Books as PDFs or EPUBs in seconds. Have an Ad-Free experience with Unlimited Offline Reading. Try it now!" />
<meta property="og:image" content="https://wpd.rambhat.la/embed.png" />
<!-- Twitter -->
<meta property="twitter:card" content="summary_large_image" />
<meta property="twitter:url" content="https://wpd.rambhat.la/" />
<meta property="twitter:title" content="Wattpad Downloader" />
<meta property="twitter:description" content="Read your way, download Wattpad Books as EPUBs in seconds. Have an Ad-Free experience with Unlimited Offline Reading. Try it now!" />
<meta property="twitter:description" content="Read your way, download Wattpad Books as PDFs or EPUBs in seconds. Have an Ad-Free experience with Unlimited Offline Reading. Try it now!" />
<meta property="twitter:image" content="https://wpd.rambhat.la/embed.png" />
<!-- Meta Tags Generated with https://metatags.io -->
<script defer src="https://feedback.fish/ff.js?pid=f8df016d4ffdfb"></script>
%sveltekit.head%
</head>
<body data-sveltekit-preload-data="hover">
+4 -4
@@ -16,17 +16,17 @@
class="footer footer-center p-4 bg-base-300 text-base-content bottom-0 fixed"
>
<aside>
<div class="grid grid-cols-3 max-w-lg w-full">
<div class="flex flex-row max-w-lg w-full">
<a
href="https://liberapay.com/TheOnlyWayUp/"
href="/donate"
target="_blank"
class="link"
data-umami-event="Footer Donate">Donate</a
data-umami-event="Footer Donate">Buy me a Coffee!</a
>
<a
href="https://rambhat.la"
target="_blank"
class="link"
class="link flex-1"
data-umami-event="Footer AboutMe">About Me</a
>
<a
+169 -38
@@ -1,32 +1,92 @@
<script>
let story_id = "";
let download_images = false;
let download_as_pdf = false; // false = EPUB, true = PDF
let is_paid_story = false;
let invalid_url = false;
let after_download_page = false;
let credentials = {
username: "",
password: "",
};
let after_download_page = false;
let url = "";
let download_id = "";
let mode = "";
let input_url = "";
let button_disabled = false;
$: button_disabled =
!story_id ||
!input_url ||
(is_paid_story && !(credentials.username && credentials.password));
$: url =
`/download/${story_id}?om=1` +
`/download/` +
download_id +
`?om=1` +
(download_images ? "&download_images=true" : "") +
(is_paid_story
? `&username=${credentials.username}&password=${credentials.password}`
: "");
? `&username=${encodeURIComponent(credentials.username)}&password=${encodeURIComponent(credentials.password)}`
: "") +
`&mode=${mode}` +
(download_as_pdf ? "&format=pdf" : "&format=epub");
$: {
if (input_url.length) {
input_url = input_url.toLowerCase();
invalid_url = false;
if (/^\d+$/.test(input_url)) {
// All numbers
download_id = input_url;
mode = "story";
} else if (input_url.includes("wattpad.com/")) {
// Not a bare number, but contains wattpad.com/
if (input_url.includes("/story/")) {
// https://wattpad.com/story/237369078-wattpad-books-presents
input_url = input_url.split("-")[0].split("?")[0].split("/story/")[1]; // removes tracking fields and title
download_id = input_url;
mode = "story";
} else if (input_url.includes("/stories/")) {
// https://www.wattpad.com/api/v3/stories/237369078?fields=...
input_url = input_url.split("?")[0].split("/stories/")[1]; // removes params
download_id = input_url;
mode = "story";
} else {
// https://www.wattpad.com/939051741-wattpad-books-presents-the-qb-bad-boy-and-me
input_url = input_url
.split("-")[0]
.split("?")[0]
.split("wattpad.com/")[1]; // removes tracking fields and title
download_id = input_url;
if (/^\d+$/.test(download_id)) {
// If "wattpad.com/{download_id}" contains only numbers
mode = "part";
} else {
invalid_url = true;
input_url = "";
download_id = "";
}
}
} else {
invalid_url = true;
}
input_url = input_url.match(/\d+/g)?.join("") || "";
download_id = input_url;
// Originally, I was going to call the Wattpad API (wattpad.com/api/v3/stories/${story_id}), but Wattpad kept blocking those requests. I suspect the Origin header is the cause, and I wasn't able to remove it.
// In the future, if this is considered, it would be cool if we could derive the Story ID from a pasted Part URL. Refer to @AaronBenDaniel's https://github.com/AaronBenDaniel/WattpadDownloader/blob/49b29b245188149f2d24c0b1c59e4c7f90f289a9/src/api/src/create_book.py#L156 (https://www.wattpad.com/api/v3/story_parts/{part_id}?fields=url).
} else {
invalid_url = false;
download_id = "";
}
}
</script>
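The reactive block above parses the pasted input by repeatedly splitting and reassigning `input_url`. As a point of comparison, here is a minimal sketch of the same rules written as pure functions, assuming the three URL shapes named in the comments (a `/story/` URL, an `/api/v3/stories/` URL, and a bare `wattpad.com/<id>-<slug>` part URL). The names `parseWattpadInput`, `buildDownloadUrl`, `DownloadTarget` and `DownloadOptions` are hypothetical and do not appear in the repository; the query parameters mirror the ones the component builds.

```typescript
type DownloadMode = "story" | "part";

interface DownloadTarget {
  id: string;
  mode: DownloadMode;
}

interface DownloadOptions {
  downloadImages: boolean;
  asPdf: boolean;
  credentials?: { username: string; password: string };
}

// Hypothetical helper: returns null when the input is neither a numeric ID
// nor one of the recognised Wattpad URL shapes.
function parseWattpadInput(raw: string): DownloadTarget | null {
  const input = raw.trim().toLowerCase();
  if (input === "") return null;

  // A bare number is treated as a story ID, as in the component.
  if (/^\d+$/.test(input)) return { id: input, mode: "story" };

  // wattpad.com/story/<id>-<slug> and wattpad.com/api/v3/stories/<id>?...
  const storyId = input.match(
    /wattpad\.com\/(?:.*\/)?(?:story|stories)\/(\d+)/
  )?.[1];
  if (storyId) return { id: storyId, mode: "story" };

  // wattpad.com/<id>-<slug> points at a single part.
  const partId = input.match(/wattpad\.com\/(\d+)(?:[-?#\/]|$)/)?.[1];
  if (partId) return { id: partId, mode: "part" };

  return null;
}

// Hypothetical helper: assembles the same query string as the `$: url`
// statement, percent-encoding the credentials.
function buildDownloadUrl(
  target: DownloadTarget,
  opts: DownloadOptions
): string {
  const query: string[] = ["om=1"];
  if (opts.downloadImages) query.push("download_images=true");
  if (opts.credentials) {
    query.push(`username=${encodeURIComponent(opts.credentials.username)}`);
    query.push(`password=${encodeURIComponent(opts.credentials.password)}`);
  }
  query.push(`mode=${target.mode}`);
  query.push(`format=${opts.asPdf ? "pdf" : "epub"}`);
  return `/download/${target.id}?${query.join("&")}`;
}
```

With this shape, the component could keep `input_url` exactly as the user typed it and derive `download_id`, `mode` and `invalid_url` from the return value, rather than overwriting the bound input while the user is still typing.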
<div>
<div class="hero min-h-screen">
<div
class="hero-content flex-col lg:flex-row-reverse bg-base-100/50 p-16 rounded shadow-sm"
class="hero-content flex-col lg:flex-row-reverse bg-base-100/50 lg:p-16 py-32 rounded shadow-sm"
>
{#if !after_download_page}
<div class="text-center lg:text-left lg:p-10">
@@ -35,32 +95,78 @@
>
Wattpad Downloader
</h1>
<!-- <div role="alert" class="alert bg-cyan-300 mt-5">
<svg
xmlns="http://www.w3.org/2000/svg"
fill="none"
viewBox="0 0 24 24"
class="h-6 w-6 shrink-0 stroke-current"
>
<path
stroke-linecap="round"
stroke-linejoin="round"
stroke-width="2"
d="M13 16h-1v-4h-1m1-4h.01M21 12a9 9 0 11-18 0 9 9 0 0118 0z"
></path>
</svg>
<span class="text-lg">Please Donate</span>
</div> -->
<p class="pt-6 text-lg">
Download your favourite books with a single click!
</p>
<ul class="pt-4 list list-inside text-xl">
<li>06/24 - 🎉 Image Downloading!</li>
<!-- TODO: 'max-lg:hidden' to hide on screen sizes smaller than lg. I'll do this when I figure out how to make this show up _below_ the card on smaller screen sizes. -->
<li>12/24 - ⚡ Super-fast Downloads!</li>
<li>12/24 - 📑 PDF Downloads!</li>
<li>12/24 - 📂 Improved Performance</li>
<li>11/24 - 🔗 Paste Links!</li>
<li>11/24 - 📨 Send to Kindle Support!</li>
<li>11/24 - ⚒️ Fix Image Downloads</li>
<li>
10/24 - 👾 Add the <a
href="https://discord.com/oauth2/authorize?client_id=1292173380065296395&permissions=274878285888&scope=bot%20applications.commands"
target="_blank"
class="link underline">Discord Bot</a
>!
</li>
<li>07/24 - 🔡 RTL Language support! (Arabic, etc.)</li>
<li>06/24 - 🔑 Authenticated Downloads!</li>
<li>06/24 - 🖼️ Image Downloading!</li>
</ul>
</div>
<div class="card shrink-0 w-full max-w-sm shadow-2xl bg-base-100">
<form class="card-body">
<div class="form-control">
<input
type="number"
placeholder="Story ID"
type="text"
placeholder="Story URL"
class="input input-bordered"
bind:value={story_id}
class:input-warning={invalid_url}
bind:value={input_url}
required
name="story_id"
name="input_url"
/>
<label class="label" for="story_id">
<button
class="label-text link font-semibold"
onclick="StoryIDTutorialModal.showModal()"
data-umami-event="StoryIDTutorialModal Open"
>How to get a Story ID</button
>
<label class="label" for="input_url">
{#if invalid_url}
<p class=" text-red-500">
Refer to (<button
class="link font-semibold"
onclick="StoryURLTutorialModal.showModal()"
data-umami-event="Part StoryURLTutorialModal Open"
>How to get a Story URL</button
>).
</p>
{:else}
<button
class="label-text link font-semibold"
onclick="StoryURLTutorialModal.showModal()"
data-umami-event="StoryURLTutorialModal Open"
>How to get a Story URL</button
>
{/if}
</label>
<label class="cursor-pointer label">
<span class="label-text"
>This is a Paid Story, and I've purchased it</span
@@ -99,13 +205,25 @@
<div class="form-control mt-6">
<a
class="btn btn-primary rounded-l-none"
class="btn rounded-l-none"
class:btn-primary={!download_as_pdf}
class:btn-secondary={download_as_pdf}
class:btn-disabled={button_disabled}
data-umami-event="Download"
href={url}
on:click={() => (after_download_page = true)}>Download</a
>
<label class="swap w-fit label mt-2">
<input type="checkbox" bind:checked={download_as_pdf} />
<div class="swap-on">
Downloading as <span class="underline font-bold">PDF</span> (Click)
</div>
<div class="swap-off">
Downloading as <span class="underline font-bold">EPUB</span> (Click)
</div>
</label>
<label class="cursor-pointer label">
<span class="label-text"
>Include Images (<strong>Slower Download</strong>)</span
@@ -151,7 +269,21 @@
>, where we release features early and discuss updates.
</p>
</div>
<a href="/" class="btn btn-outline btn-lg mt-10">Download More</a>
<div class="grid justify-center grid-rows-2 gap-y-10">
<a
href="/donate"
target="_blank"
class="btn bg-cyan-200 btn-lg mt-10 hover:bg-green-200"
>Buy me a Coffee! 🍵</a
>
<button
on:click={() => {
after_download_page = false;
input_url = "";
}}
class="btn btn-outline btn-lg">Download More</button
>
</div>
</div>
{/if}
</div>
@@ -160,32 +292,31 @@
<!-- Open the modal using ID.showModal() method -->
<dialog id="StoryIDTutorialModal" class="modal">
<dialog id="StoryURLTutorialModal" class="modal">
<div class="modal-box">
<form method="dialog">
<button class="btn btn-sm btn-circle btn-ghost absolute right-2 top-2"
></button
>
</form>
<h3 class="font-bold text-lg">Downloading a Story</h3>
<ol class="list list-disc list-inside py-4 space-y-2">
<h3 class="font-bold text-lg">Finding the Story URL</h3>
<ol class="list list-disc list-inside py-4 space-y-4">
<li>
Open the Story URL (For example, <span
class="font-mono bg-slate-100 p-1"
>wattpad.com/story/237369078-wattpad-books-presents</span
>)
Copy the URL from the Website, or hit share and copy the URL on the App.
</li>
<li>
Copy the numbers after the <span class="font-mono bg-slate-100 p-1"
>/</span
>
(In the example, that'd be,
For example,
<span class="font-mono bg-slate-100 p-1"
>wattpad.com/story/<span class="bg-amber-200 p-1">237369078</span
>-wattpad-books-presents</span
>)
>wattpad.com/<span class="bg-amber-200 rounded-sm">story</span
>/237369078-wattpad-books-presents</span
>.
</li>
<li>Paste the Story ID and hit Download!</li>
<li>
<span class="font-mono bg-slate-100 p-1"
>https://www.wattpad.com/939103774-given</span
> is okay too.
</li>
<li>Paste the URL and hit Download!</li>
</ol>
</div>
<form method="dialog" class="modal-backdrop">