190 Commits

Author SHA1 Message Date
Nikhil Tanwar
d8656ec149 Introduce HTMLDumper
The HTMLDumper class will be used to dump the library in HTML format. It inherits from LibraryDumper.
2023-03-28 20:25:44 +05:30
Veloman Yunkan
2550306052 One more usage of Book::getLanguages()
`Book::getLanguages()` is used instead of `Book::getLanguage()` when
determining the set of languages for a collection of books.
2023-03-08 15:24:53 +01:00
Veloman Yunkan
ac742e9da2 Redirection of slashless root URL
With a non-empty root location, the canonical form of the root URL for a
kiwix server is now required to end with a slash (to match the situation
for an empty root location). This requirement enables usage of relative
URLs on the welcome page and resources/scripts loaded through that page.

A slashless root URL is redirected to the slashful version.
2023-02-22 17:54:20 +04:00
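
The redirect rule described in ac742e9da2 can be illustrated with a minimal Python sketch. This is not the libkiwix implementation: `redirect_target`, the host name and `script.js` are hypothetical, and the root location is assumed to be passed without a trailing slash. The two `urljoin` calls show why the trailing slash matters for relative URLs.

```python
from typing import Optional
from urllib.parse import urljoin


def redirect_target(root: str, path: str) -> Optional[str]:
    """Return the URL to redirect to, or None if no redirect is needed.

    `root` is the root location without a trailing slash ("" when empty).
    """
    if root and path == root:
        # Slashless root URL: send the client to the slashful form.
        return root + "/"
    return None


print(redirect_target("/ROOT", "/ROOT"))   # /ROOT/
print(redirect_target("/ROOT", "/ROOT/"))  # None

# A relative URL resolves inside the root only if the base ends with a slash.
with_slash = urljoin("http://host/ROOT/", "script.js")
without_slash = urljoin("http://host/ROOT", "script.js")
print(with_slash)     # http://host/ROOT/script.js
print(without_slash)  # http://host/script.js
```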
Veloman Yunkan
2e0124710a ?count=0 OPDS catalog queries return 0 results
... which is a useful way of finding out the total number of results
with the least consumption of resources.
2023-02-10 19:15:29 +01:00
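
As a rough model of the behaviour described in 2e0124710a, the Python sketch below pages through a list of matches. The function `page` and the catalog entries are made up for illustration, and the real OPDS feed reports its totals in its own format. The point is only that a page of size 0 still carries the total.

```python
from typing import List, Tuple


def page(results: List[str], start: int, count: int) -> Tuple[int, List[str]]:
    """Return (total number of matches, entries of the requested page)."""
    total = len(results)
    return total, results[start:start + count]


books = ["book_a", "book_b", "book_c"]  # made-up catalog entries
total, entries = page(books, start=0, count=0)
print(total, entries)  # 3 []
```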
Veloman Yunkan
c2fffacbbd Renamed a data member 2023-02-09 10:40:23 +01:00
Veloman Yunkan
05a66ead6e URI-encoding of the root location part
Now the root location is URI-encoded too.

In order to properly test this change the root location in the tests was
changed from "/ROOT" to "/ROOT#?" (or "/ROOT%23%3F" in URI-encoded form),
which is why this commit is so big.
2023-02-09 10:40:07 +01:00
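
The encoded form of the test root location quoted in 05a66ead6e can be reproduced with Python's standard library. This only illustrates percent-encoding in general; it does not use libkiwix's `urlEncode()`, and keeping `/` unencoded is an assumption made here to match the quoted result.

```python
from urllib.parse import quote, unquote

root = "/ROOT#?"
# Percent-encode everything except unreserved characters and "/".
encoded = quote(root, safe="/")
print(encoded)  # /ROOT%23%3F

# Decoding restores the original root location.
decoded = unquote(encoded)
print(decoded)  # /ROOT#?
```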
Veloman Yunkan
97f0314fe6 Saving a few CPU cycles
This silly optimization in fact helps to avoid a somewhat more serious
waste of CPU cycles that would otherwise result in the next commit.
2023-02-08 22:16:27 +01:00
Veloman Yunkan
a7fe4193e3 Preparing to save a few CPU cycles 2023-02-08 22:16:27 +01:00
Veloman Yunkan
2c5e84b6b3 Simpler fullURL2LocalURL() 2023-02-08 22:16:27 +01:00
Veloman Yunkan
71a66e0528 Passing of unrooted URL into RequestContext()
This change doesn't make much sense on its own - the real goal is to
prepare some ground for easier implementation of URI-encoding of the root
location.
2023-02-08 22:16:27 +01:00
Veloman Yunkan
a807ce27f1 URI-encoding when redirecting legacy URLs to /content
Testing of this functionality revealed that a query part containing +
symbols (as replacements for spaces in the parameter values) isn't
forwarded properly, as the + symbols get URI-encoded (this is a bug in
`RequestContext::get_query()`, whose result already contains
URI-encoded +'s).
2023-02-08 22:16:27 +01:00
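
The effect of the bug noted in a807ce27f1 can be shown with standard URL parsing in Python. The parameter name `pattern` is hypothetical. A `+` in a query string stands for a space, so a forwarded query in which it has been turned into `%2B` decodes to a literal plus sign instead.

```python
from urllib.parse import parse_qs

# Query as sent by the client: "+" stands for a space.
forwarded_ok = parse_qs("pattern=a+b")
# Query after the "+" was URI-encoded on the way into the redirect.
forwarded_broken = parse_qs("pattern=a%2Bb")

print(forwarded_ok)      # {'pattern': ['a b']}
print(forwarded_broken)  # {'pattern': ['a+b']}
```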
Veloman Yunkan
2e9bec95b0 Proper URI-encoding in InternalServer::build_redirect()
- Before this change `InternalServer::build_redirect()` only URI-encoded the
  article path, ignoring the book name and/or the root location components of
  the URL.

- In order to be able to test this fix, corner_cases.zim was renamed to
  contain a couple of special URL symbols in its filename. The
  `create_corner_cases_zim_file` script was updated accordingly.
2023-02-08 22:16:09 +01:00
Veloman Yunkan
471c5b89f4 Dropped the 2nd param of urlEncode()
`urlEncode(str)` is now equivalent to the previous `urlEncode(str, true)`.
2023-01-25 19:15:12 +04:00
Veloman Yunkan
8eb527389e URI-encoding of redirections to URLs with special symbols 2023-01-10 17:41:59 +04:00
Veloman Yunkan
abcd4ade99 kiwix::Suggestions::getJSON() 2022-11-17 11:51:53 +04:00
Veloman Yunkan
7a9780eb90 kiwix::Suggestions::addFTSearchSuggestion() 2022-11-17 11:51:53 +04:00
Veloman Yunkan
51bd881211 kiwix::Suggestions::add() 2022-11-17 11:51:53 +04:00
Veloman Yunkan
f36f1661d5 Got rid of result count tracker variable 2022-11-17 11:51:53 +04:00
Veloman Yunkan
18f4a58237 Conception of kiwix::Suggestions 2022-11-17 11:51:53 +04:00
Veloman Yunkan
c87add1419 Removed an unused variable 2022-11-01 19:16:30 +01:00
Veloman Yunkan
9409e8bd91 Preventing confusion of tongues in multizim search
Multizim search requires that all selected books be in the same
language.

No new URL query parameter was introduced for specifying the intended
search language - `books.filter.lang` can be used for that purpose.

The server_search unit-test was updated to use a slightly cheating
library xml file where the language of example.zim was tweaked from "en"
to "eng" in order to match that of zimfile.zim. Note that this change
drops from the tested server two other goofy ZIM files, corner_cases.zim
and poor.zim, that have been/are included in ServerTest.
2022-10-31 13:27:57 +04:00
Veloman Yunkan
cd62b5dd91 Some clean-up 2022-10-31 13:22:15 +04:00
Veloman Yunkan
414d7ae4fe Fixed indentation 2022-10-31 13:22:15 +04:00
Veloman Yunkan
9d2cc35447 Extracted InternalServer::handle_search_request() 2022-10-31 13:22:15 +04:00
Matthieu Gautier
e5b94fa1bb Make the opds_dumper respect the provided nameMapper used in the server.
Fix #828
2022-10-30 19:21:01 +01:00
Veloman Yunkan
b9f60ecfe9 Handling of cacheid when serving static resources
During static resource preprocessing and compilation their cacheid
values are embedded into libkiwix and can be accessed at runtime.

If a static resource is requested without specifying any cacheid
it is served as dynamic content (with short TTL and the library id
used for the ETag, though using the cacheid for the ETag would
be better).

If a cacheid is supplied in the request it must match the cacheid of the
resource (otherwise a 404 Not Found error is returned) whereupon the
resource is served as immutable content.

Known issues:

- One issue is caused by the fact that some static resources don't get a
  cacheid; this is resolved in the next commit.

- Interaction of this change with the support for dynamically customizing
  static resources (via KIWIX_SERVE_CUSTOMIZED_RESOURCES env var) was
  not addressed.
2022-10-19 19:26:04 +04:00
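
The three cases described in b9f60ecfe9 can be summarised in a small Python sketch. The function name and the string labels are hypothetical and only model the decision; they are not taken from the libkiwix sources.

```python
from typing import Optional


def classify_static_request(resource_cacheid: str,
                            requested_cacheid: Optional[str]) -> str:
    """Decide how a static resource request is answered."""
    if requested_cacheid is None:
        # No cacheid in the request: served as dynamic content (short TTL).
        return "dynamic"
    if requested_cacheid != resource_cacheid:
        # A cacheid was supplied but does not match: 404 Not Found.
        return "not found"
    # Matching cacheid: the resource can be served as immutable content.
    return "immutable"


print(classify_static_request("abc123", None))      # dynamic
print(classify_static_request("abc123", "abc123"))  # immutable
print(classify_static_request("abc123", "zzz999"))  # not found
```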
Veloman Yunkan
9fd1423100 Small clean-up 2022-10-19 19:26:04 +04:00
Veloman Yunkan
6b8d6232f0 InternalServer::getLibraryId() 2022-10-19 19:26:02 +04:00
Veloman Yunkan
c91df1cb26 Two private funcs of InternalServer became free 2022-10-19 19:21:28 +04:00
Veloman Yunkan
b249edee60 ETags for ZIM content use the ZIM file UUID 2022-10-19 19:21:28 +04:00
Veloman Yunkan
190156e095 Setting Cache-Control: for three types of content
At this point the ETag value for ZIM content is still generated from the
timestamp of the server start-up time.
2022-10-19 19:21:28 +04:00
Veloman Yunkan
73191fb8f8 Made the /suggest endpoint concurrency-safe 2022-10-13 13:39:25 +04:00
Veloman Yunkan
582c8d868a New logic for generating HTTP-redirects
Before this fix the root URL for a book was assumed to resolve to the
main page.  This was not true for ZIM files containing an entry at an
empty path or with a path equal to "/", resulting in issue #826. The
logic behind this behaviour is found in `kiwix::getEntryFromPath()`.

The fix to that issue is a little more general and will result in an
HTTP redirect in any case where `kiwix::getEntryFromPath(zim, path)`
returns an entry with a real path different from the requested one. In
particular, this will affect the behaviour on ZIM files with the old
namespace scheme, where the requested resource - if not found - is also
looked up in the 'A', 'I', 'J', and/or '-' namespaces. Now instead of
returning the contents of that other resource an HTTP redirect response
will be sent.
2022-10-04 14:18:08 +04:00
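
The rule introduced in 582c8d868a (redirect whenever the entry found has a real path different from the requested one) can be sketched in Python as follows. `resolve` is a toy stand-in for `kiwix::getEntryFromPath()` operating on a plain dictionary; trying the namespaces in a fixed order and using status 302 for the redirect are assumptions of this sketch.

```python
from typing import Dict, Optional, Tuple

FALLBACK_NAMESPACES = ("A", "I", "J", "-")


def resolve(entries: Dict[str, str], path: str) -> Optional[str]:
    """Return the real path of the entry matching `path`, or None."""
    if path in entries:
        return path
    for ns in FALLBACK_NAMESPACES:
        candidate = ns + "/" + path
        if candidate in entries:
            return candidate
    return None


def respond(entries: Dict[str, str], path: str) -> Tuple[int, str]:
    """Return (status, body or redirect target) for a content request."""
    real_path = resolve(entries, path)
    if real_path is None:
        return 404, ""
    if real_path != path:
        # Found under a different path: redirect instead of serving it.
        return 302, real_path
    return 200, entries[real_path]


zim = {"A/index.html": "<html>main</html>"}  # made-up old-scheme ZIM content
print(respond(zim, "index.html"))    # (302, 'A/index.html')
print(respond(zim, "A/index.html"))  # (200, '<html>main</html>')
print(respond(zim, "missing"))       # (404, '')
```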
Veloman Yunkan
60148717e1 Fixed search results for kiwix-desktop 2022-09-26 13:11:25 +04:00
Veloman Yunkan
cac2d212c6 Respecting the --nosearchbar option of kiwix-serve
If `kiwix-serve` is run with the `--nosearchbar` option the toolbar is
disabled (hidden) in its viewer.

Note however that certain actions performed by the viewer merely with
the purpose of keeping the toolbar up-to-date are still carried out.
2022-09-21 15:41:40 +04:00
Veloman Yunkan
da23e4eca4 Revert "Partly respecting the kiwix-serve --nosearchbar option"
This reverts commit 436d890893713c5eb98df6893d0e0b41b22e2472.
2022-09-21 15:41:40 +04:00
Veloman Yunkan
2be9ac342f Partly respecting the kiwix-serve --nosearchbar option
`--nosearchbar` option of `kiwix-serve` (despite its misleading name)
was used to disable the entire taskbar. This commit accounts for the
existence of that option only partially:

1. Links to books on the welcome/library page are affected - by default
   books are displayed in the viewer, but in a kiwix-serve instance run
   with --nosearchbar books are loaded in the top window.

2. The `/viewer` endpoint is enabled unconditionally, so if anyone
   enters the viewer URL in the address bar they will see books in the
   viewer.
2022-09-21 15:41:40 +04:00
Veloman Yunkan
369406fb5d Viewer settings
Made the viewer respect the `--blockexternal` and `--nolibrarybutton`
options of `kiwix-serve`. Those options are passed to the viewer
via the dynamically generated resource `/viewer_settings.js`.
2022-09-21 15:41:40 +04:00
Veloman Yunkan
b81cb3a8e9 Got rid of raw mode in response generation 2022-09-21 15:41:40 +04:00
Veloman Yunkan
0ce36e6246 Got rid of isHomePage in ContentResponse::build() 2022-09-21 15:41:40 +04:00
Veloman Yunkan
eb0a45b13e Undefaulted bool params of ContentResponse::build()
This resulted in compiler-aided discovery of all call sites where the
default values were used. For OPDS/catalog requests, true is now passed for
the `raw` parameter, since XML content isn't supposed to undergo any
transformations.
2022-09-21 15:41:40 +04:00
Veloman Yunkan
c988511561 Removed unused param from ContentResponse::build()
Removed the isHomePage param from one of the variants of
`ContentResponse::build()`. The other overload is dangerous since
failing to review and update all of its call sites may result in changed
semantics. Will do it in a couple of separate commits.
2022-09-21 15:41:40 +04:00
Veloman Yunkan
0cf4850a9b Dropped TaskbarInfo 2022-09-21 15:41:40 +04:00
Veloman Yunkan
4db443eca6 Embryo of iframe-based viewer 2022-09-21 15:41:40 +04:00
Emmanuel Engelhart
1062bd73a3 It's libkiwix, not kiwixlib 2022-09-11 16:05:25 +02:00
Veloman Yunkan
e323dcf6c9 Redirecting /nonendpoint URLs to /content/nonendpoint 2022-08-11 18:04:05 +04:00
Veloman Yunkan
3b98987cb3 More robust handling of endpoint URLs
The next goal is to redirect old-style /book/path/to/entry URLs to
/content/book/path/to/entry, which seemed pretty trivial.

However, given the current handling of some endpoint URLs, more work was
required to ensure that invalid endpoint URLs (e.g.  "/random/number" or
"/suggest/fr") are not interpreted as content URLs. Previously, that was
not a user-observable issue, since the result would be an immediate 404
error (except in certain edge cases, like handling the request for
"/random/number" when there is a book with name "random" containing an
article at path "/number"). With redirection of URLs that were assumed
to refer to content, a 404 error would be issued for the
transformed URL ("/content/random/number"), which may be confusing.

Therefore this change is to ensure the correct routing of endpoint URL
handling.
2022-08-11 18:04:05 +04:00
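
The routing concern explained in 3b98987cb3 can be modelled with a short Python sketch. The endpoint set is limited to the two endpoints named in the message, the returned strings are hypothetical labels, and answering an invalid endpoint URL with a plain 404 is an assumption of this sketch rather than a statement about the actual handler.

```python
EXACT_ENDPOINTS = {"/random", "/suggest"}


def route(path: str) -> str:
    """Decide what to do with a request path (relative to the root)."""
    if path == "/":
        return "welcome page"
    if path == "/content" or path.startswith("/content/"):
        return "serve content"
    first_segment = "/" + path.lstrip("/").split("/", 1)[0]
    if first_segment in EXACT_ENDPOINTS:
        # "/random/number" is an invalid endpoint URL, not a content URL.
        return "handle endpoint" if path == first_segment else "404"
    # Anything else is treated as an old-style content URL.
    return "redirect to /content" + path


print(route("/suggest"))             # handle endpoint
print(route("/random/number"))       # 404
print(route("/book/path/to/entry"))  # redirect to /content/book/path/to/entry
```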
Veloman Yunkan
1b1c1e352e Introduced /content endpoint
Book content is now served under /content/book/...

The old access to book content via a top-level URL /book/... is so far
preserved for backward compatibility.

Redirects were changed to use the new URL scheme. Links in the search results
still use the old scheme.
2022-08-11 18:04:05 +04:00
Veloman Yunkan
a4b18893aa Moved handling of the "/" URL 2022-08-11 18:04:05 +04:00
Matthieu Gautier
69931fb347 Remove libzim's wrapper.
It is time to remove them. They have been deprecated since 10.0.0.
2022-07-02 16:33:32 +02:00