23 Commits
v0.1 ... v0.2

Author SHA1 Message Date
69339abe24 fix the way repo name label is handled 2018-08-01 00:25:29 -07:00
8d2718d783 update how we store totals 2018-07-31 23:58:19 -07:00
8912b945fe remove print statement 2018-07-31 23:17:16 -07:00
ddceb16a2c fix template rendering in update_index url endpoint 2018-07-31 23:16:45 -07:00
f769d18b4e clean up flask config file 2018-07-31 23:16:23 -07:00
34a889479a Update config_flask.py 2018-07-31 23:12:57 -07:00
a074e6c0e7 add image to readme 2018-07-31 23:07:32 -07:00
918c9d583f update search results template 2018-07-31 23:01:38 -07:00
6cd505087b package up the counts in get_document_total_count 2018-07-31 22:37:20 -07:00
ee9b3bb811 pass a count dictionary instead of an integer to the jinja template 2018-07-31 22:36:43 -07:00
8a4e20b71c update template - gotta look good 2018-07-31 22:36:13 -07:00
64d3ce4a9b update search engine style to use centillion logo 2018-07-31 18:29:01 -07:00
5e9b584d26 uncovered the mysterious missing google docs: they were just being labeled as issues by the search template. 2018-07-31 15:59:21 -07:00
b03a42d261 start some troubleshooting 2018-07-31 05:21:58 -07:00
bd4f4da8dc more fixes - use "" not None 2018-07-31 05:15:22 -07:00
23743773a6 add mkdocs-material submodule 2018-07-31 04:33:27 -07:00
b7d2a8c960 rename some files, and move docs into docs/ 2018-07-31 04:32:38 -07:00
1f4b43163a fix env var name 2018-07-31 03:16:28 -07:00
f80ccc2520 successfully indexing, unsuccessfully searching 2018-07-31 03:06:25 -07:00
c2eae4f521 improve handling of repo names, owners, and document schema. improve timestamps. 2018-07-31 01:52:44 -07:00
c758ca7a6c add quickstart 2018-07-31 01:28:38 -07:00
3cf142465a updating readme with flask mention 2018-07-31 01:23:49 -07:00
bfd351c990 Update 'Workdone.md' 2018-07-31 08:12:28 +00:00
21 changed files with 454 additions and 310 deletions

.gitmodules vendored Normal file (+3 lines)

@@ -0,0 +1,3 @@
[submodule "mkdocs-material"]
path = mkdocs-material
url = https://git.charlesreid1.com/charlesreid1/mkdocs-material.git

README.md

@@ -1,4 +1,4 @@
# centillion
# The Centillion
**the centillion**: a pan-github-markdown-issues-google-docs search engine.
@@ -6,62 +6,40 @@
the centillion is 3.03 log-times better than the googol.
![Screen shot of centillion](img/ss.png)
## what is it
The centillion is a search engine built using [whoosh](#),
The centillion is a search engine built using [whoosh](https://whoosh.readthedocs.io/en/latest/intro.html),
a Python library for building search engines.
We define the types of documents the centillion should index,
and how, using what fields. The centillion then builds and
updates a search index.
what info and how. The centillion then builds and
updates a search index. That's all done in `centillion_search.py`.
The centillion also provides a simple web frontend for running
queries against the search index.
queries against the search index. That's done using a Flask server
defined in `centillion.py`.
The centillion keeps it simple.
## work that is done
## quickstart
See [Workdone.md](Workdone.md)
Run the centillion app with a github access token API key set via
environment variable:
```
GITHUB_TOKEN="XXXXXXXX" python centillion.py
```
This will start a Flask server, and you can view the minimal search engine
interface in your browser at <http://localhost:5000>.
## more info
For more info see the documentation: <https://charlesreid1.github.io/centillion>
## work that is being done
See [Workinprogress.md](Workinprogress.md) for details about
route and function layout. Summary below.
### code organization
centillion app routes:
- home
- if not logged in, landing page
- if logged in, redirect to search
- search
- main_index_update
- update main index, all docs period
centillion Search functions:
- open_index creates the schema
- add_issue, add_md, add_document have three diff method sigs and add diff types
of documents to the search index
- update_all_issues or update_all_md or update_all_documents iterates over items
and determines whether each item needs to be updated in the search index
- update_main_index - update the entire search index
- calls all three update_all methods
- create_search_results - package things up for jinja
- search - run the query, pass results to the jinja-packager
## work that is planned
See [Workplanned.md](Workplanned.md)
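The README above splits the project into a Search class (`centillion_search.py`) that owns the whoosh index and a Flask server (`centillion.py`) that runs queries against it. A minimal sketch of how the two pieces fit together, based only on the calls visible elsewhere in this diff; route names and any details not shown there are assumptions:

```python
# Sketch only: wiring between the Flask frontend and the Search class,
# using the calls that appear in this diff (search.search, get_document_total_count).
import os
from flask import Flask, render_template, request

from centillion_search import Search

app = Flask(__name__)
search = Search("search_index")     # Search manages the whoosh index on disk

@app.route('/search')
def do_search():
    query = request.args.get('query', '')
    fields = request.args.get('fields', '')
    parsed_query, result = search.search(query.split(), fields=[fields])
    totals = search.get_document_total_count()
    return render_template('search.html', entries=result, query=query,
                           parsed_query=parsed_query, fields=fields,
                           totals=totals)

if __name__ == "__main__":
    # the quickstart above sets GITHUB_TOKEN in the environment before launch
    app.run(port=5000)
```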

Todo.md Normal file (+7 lines)

@@ -0,0 +1,7 @@
# todo
current problems:
- some github issues have no title
- github issues are just being re-indexed over and over
- documents not showing up in results

Workinprogress.md (deleted)

@@ -1,106 +0,0 @@
# Components
The components of centillion are as follows:
- Flask application, which creates a Search object and uses it to search index
- Search object, which allows you to create/update/search an index
## Routes layout
Current application routes are as follows:
- home -> search
- search
- update_index
Ideal application routes (using github flask dance oauth):
- home
- if not logged in, landing page
- if logged in, redirect to search
- search
- main_index_update
- update main index, all docs period
- delta_index_update
- updates delta index, docs that have changed since last main index
There should be one route to update the main index
There should be another route to update the delta index
These should go off and call the update index methods
for each respective type of document/collection.
For example, if I call `main_index_update` route it should
- call `main_index_update` for all github issues
- call `main_index_update` for folder of markdown docs
- call `main_index_update` for google drive folder
These are all members of the Search class
## Functions layout
Functions of the entire search app:
- create a search index
- load a search index
- call the search() method on the index
- update the search index
The first and last, creating and updating the search index,
are of greatest interest.
The Schema affects everything so it is hard to separate
functionality into a main Search class shared by many.
(Avoid inheritance/classes if possible.)
current Search:
- open_index creates the schema
- add_issue or add_document adds an item to the index
- add_all_issues or add_all_documents iterates over items and adds them to index
- update_index_incremental - update the search index
- create_search_results - package things up for jinja
- search - run the query, pass results to the jinja-packager
centillion Search:
- open_index creates the schema
- add_issue, add_md, add_document have three diff method sigs and add diff types
of documents to the search index
- update_all_issues or update_all_md or update_all_documents iterates over items
and determines whether each item needs to be updated in the search index
- update_main_index - update the entire search index
- calls all three update_all methods
- create_search_results - package things up for jinja
- search - run the query, pass results to the jinja-packager
Nice to have but focus on it later:
- update_diff_issues or update_diff_md or update_diff_documents iterates over items
and indexes recently-added items
- update_diff_index - update the diff search index (what's been added since last
time)
- calls all three update_diff methods
## Files layout
Schema definition:
* include a "kind" or "class" to group objects
* can provide different searches of different collections
* eventually can provide user with checkboxes

centillion.py

@@ -38,7 +38,7 @@ class UpdateIndexTask(object):
from get_centillion_config import get_centillion_config
config = get_centillion_config('config_centillion.json')
gh_token = os.environ['GITHUB_ACESS_TOKEN']
gh_token = os.environ['GITHUB_TOKEN']
search.update_index_issues(gh_token, config)
search.update_index_gdocs(config)
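`get_centillion_config` is not shown in this diff; presumably it just loads the JSON file that lists which repositories to index. A hedged sketch of that helper and the token lookup (the function body here is an assumption):

```python
import json
import os

def get_centillion_config(path):
    # assumed implementation: read the JSON config, e.g. config_centillion.json
    with open(path, 'r') as f:
        return json.load(f)

config = get_centillion_config('config_centillion.json')
gh_token = os.environ['GITHUB_TOKEN']    # raises KeyError if the token is not set
```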
@@ -76,25 +76,26 @@ def search():
parsed_query, result = search.search(query.split(), fields=[fields])
store_search(query, fields)
total = search.get_document_total_count()
totals = search.get_document_total_count()
return render_template('search.html', entries=result, query=query, parsed_query=parsed_query, fields=fields, last_searches=get_last_searches(), total=total)
@app.route('/open')
def open_file():
path = request.args['path']
fields = request.args.get('fields')
query = request.args['query']
call([app.config["EDIT_COMMAND"], path])
return redirect(url_for("search", query=query, fields=fields))
return render_template('search.html',
entries=result,
query=query,
parsed_query=parsed_query,
fields=fields,
last_searches=get_last_searches(),
totals=totals)
@app.route('/update_index')
def update_index():
rebuild = request.args.get('rebuild')
UpdateIndexTask(diff_index=False)
flash("Rebuilding index, check console output")
return render_template("search.html", query="", fields="", last_searches=get_last_searches())
return render_template("search.html",
query="",
fields="",
last_searches=get_last_searches(),
totals={})
##############

centillion_search.py

@@ -14,6 +14,8 @@ import tempfile, subprocess
import pypandoc
import os.path
import codecs
from datetime import datetime
from whoosh.qparser import MultifieldParser, QueryParser
from whoosh.analysis import StemmingAnalyzer
@@ -57,6 +59,10 @@ Schema:
"""
def clean_timestamp(dt):
return dt.replace(microsecond=0).isoformat()
class SearchResult:
score = 1.0
path = None
@@ -115,7 +121,7 @@ class Search:
schema = Schema(
id = ID(stored=True, unique=True),
kind = ID(),
kind = ID(stored=True),
created_time = ID(stored=True),
modified_time = ID(stored=True),
@@ -172,10 +178,11 @@ class Search:
'document' : 'docx',
}
content = ""
if(mimetype not in mimemap.keys()):
# Not a document -
# Just a file
print("Indexing document %s of type %s"%(item['name'], mimetype))
print("Indexing document \"%s\" of type %s"%(item['name'], mimetype))
else:
# Document with text
# Perform content extraction
@@ -187,7 +194,7 @@ class Search:
# This is a file type we know how to convert
# Construct the URL and download it
print("Extracting content from %s of type %s"%(item['name'], mimetype))
print("Extracting content from \"%s\" of type %s"%(item['name'], mimetype))
# Create a URL and a destination filename
@@ -227,7 +234,7 @@ class Search:
)
assert output == ""
except RuntimeError:
print("XXXXXX Failed to index document %s"%(item['name']))
print("XXXXXX Failed to index document \"%s\""%(item['name']))
# If export was successful, read contents of markdown
@@ -240,7 +247,7 @@ class Search:
# No matter what happens, clean up.
print("Cleaning up %s"%item['name'])
print("Cleaning up \"%s\""%item['name'])
subprocess.call(['rm','-fr',fullpath_output])
#print(" ".join(['rm','-fr',fullpath_output]))
@@ -259,16 +266,17 @@ class Search:
kind = 'gdoc',
created_time = item['createdTime'],
modified_time = item['modifiedTime'],
indexed_time = datetime.now().replace(microsecond=0).isoformat(),
title = item['name'],
url = item['webViewLink'],
mimetype = mimetype,
owner_email = item['owners'][0]['emailAddress'],
owner_name = item['owners'][0]['displayName'],
repo_name=None,
repo_url=None,
github_user=None,
issue_title=None,
issue_url=None,
repo_name='',
repo_url='',
github_user='',
issue_title='',
issue_url='',
content = content
)
@@ -277,7 +285,7 @@ class Search:
"""
Add a Github issue/comment to a search index.
"""
repo_name = repo.name
repo_name = repo.owner.login+"/"+repo.name
repo_url = repo.html_url
count = 0
@@ -285,39 +293,62 @@ class Search:
# Handle the issue content
print("Indexing issue %s"%(issue.html_url))
created_time = clean_timestamp(issue.created_at)
modified_time = clean_timestamp(issue.updated_at)
indexed_time = clean_timestamp(datetime.now())
writer.add_document(
id = issue.html_url,
kind = 'issue',
created_time = created_time,
modified_time = modified_time,
indexed_time = indexed_time,
title = issue.title,
url = issue.html_url,
is_comment = False,
timestamp = issue.created_at,
mimetype='',
owner_email='',
owner_name='',
repo_name = repo_name,
repo_url = repo_url,
github_user = issue.user.login,
issue_title = issue.title,
issue_url = issue.html_url,
user = issue.user.login,
content = issue.body.rstrip()
)
count += 1
# Handle the comments content
if(issue.comments>0):
comments = issue.get_comments()
for comment in comments:
print(" > Indexing comment %s"%(comment.html_url))
created_time = clean_timestamp(comment.created_at)
modified_time = clean_timestamp(comment.updated_at)
indexed_time = clean_timestamp(datetime.now())
writer.add_document(
id = comment.html_url,
kind = 'comment',
created_time = created_time,
modified_time = modified_time,
indexed_time = indexed_time,
title = "Comment on "+issue.title,
url = comment.html_url,
is_comment = True,
timestamp = comment.created_at,
mimetype='',
owner_email='',
owner_name='',
repo_name = repo_name,
repo_url = repo_url,
github_user = comment.user.login,
issue_title = issue.title,
issue_url = issue.html_url,
user = comment.user.login,
content = comment.body.strip()
content = comment.body.rstrip()
)
count += 1
@@ -354,24 +385,49 @@ class Search:
drive = service.files()
# We should do more here
# to check if we should update
# or not...
#
# loop over existing documents in index:
#
# p = QueryParser("kind", schema=self.ix.schema)
# q = p.parse("gdoc")
# with self.ix.searcher() as s:
# results = s.search(q,limit=None)
# counts[key] = len(results)
# The trick is to set next page token to None 1st time thru (fencepost)
nextPageToken = None
# Use the pager to return all the things
items = []
while True:
ps = 12
results = drive.list(
pageSize=100,
pageSize=ps,
pageToken=nextPageToken,
fields="files(id, kind, createdTime, modifiedTime, mimeType, name, owners, webViewLink)",
fields="nextPageToken, files(id, kind, createdTime, modifiedTime, mimeType, name, owners, webViewLink)",
spaces="drive"
).execute()
nextPageToken = results.get("nextPageToken")
items += results.get("files", [])
if nextPageToken is None:
break
# Keep it short
break
#if nextPageToken is None:
# break
# Here is where we update.
# Grab indexed ids
# Grab remote ids
# Drop indexed ids not in remote ids
# Index all remote ids
# Change add_ to update_
# Add a hash check in update_
indexed_ids = set()
for item in items:
@@ -386,9 +442,12 @@ class Search:
count = 0
for item in items:
self.add_item(writer, item, indexed_ids, temp_dir, config)
self.add_drive_file(writer, item, indexed_ids, temp_dir, config)
count += 1
print("Cleaning temporary directory: %s"%(temp_dir))
subprocess.call(['rm','-fr',temp_dir])
writer.commit()
print("Done, updated %d documents in the index" % count)
@@ -414,14 +473,14 @@ class Search:
writer = self.ix.writer()
# Iterate over each repo
list_of_repos = config['repos']
list_of_repos = config['repositories']
for r in list_of_repos:
if '/' not in r:
err = "Error: specify org/reponame or user/reponame in list of repos"
raise Exception(err)
this_repo, this_org = re.split('/',r)
this_org, this_repo = re.split('/',r)
org = g.get_organization(this_org)
repo = org.get_repo(this_repo)
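The one-line fix above swaps the unpacking order: entries in `config['repositories']` are written `org/reponame`, so the organization must come first. For example:

```python
import re

r = "dcppc/2018-july-workshop"          # an entry from config['repositories']
this_org, this_repo = re.split('/', r)
# this_org == "dcppc", this_repo == "2018-july-workshop"
```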
@@ -441,6 +500,7 @@ class Search:
to_index.add(issue.html_url)
writer.delete_by_term('url', issue.html_url)
count -= 1
comments = issue.get_comments()
for comment in comments:
@@ -477,11 +537,6 @@ class Search:
# contains a {% for e in entries %}
# and then an {{e.score}}
# ------------------
# cheseburger
# create search results
sr = SearchResult()
sr.score = r.score
@@ -495,37 +550,29 @@ class Search:
sr.id = r['id']
sr.kind = r['kind']
sr.url = r['url']
sr.created_time = r['created_time']
sr.modified_time = r['modified_time']
sr.indexed_time = r['indexed_time']
sr.title = r['title']
sr.url = r['url']
sr.mimetype = r['mimetype']
sr.owner_email = r['owner_email']
sr.owner_name = r['owner_name']
sr.content = r['content']
# -----------------
# github isuses
# create search results
sr = SearchResult()
sr.score = r.score
sr.url = r['url']
sr.title = r['issue_title']
sr.repo_name = r['repo_name']
sr.repo_url = r['repo_url']
sr.issue_title = r['issue_title']
sr.issue_url = r['issue_url']
sr.is_comment = r['is_comment']
sr.github_user = r['github_user']
sr.content = r['content']
# ------------------
highlights = r.highlights('content')
if not highlights:
# just use the first 1,000 words of the document
@@ -558,27 +605,15 @@ class Search:
elif len(fields) == 2:
pass
else:
fields = ['id',
'kind',
'created_time',
'modified_time',
'indexed_time',
'title',
'url',
'mimetype',
'owner_email',
'owner_name',
'repo_name',
'repo_url',
'issue_title',
'issue_url',
'github_user',
'content']
# If the user does not specify a field,
# these are the fields that are actually searched
fields = ['title',
'content']
if not query:
query = MultifieldParser(fields, schema=self.ix.schema).parse(query_string)
parsed_query = "%s" % query
print("query: %s" % parsed_query)
results = searcher.search(query, terms=False, scored=True, groupedby="url")
results = searcher.search(query, terms=False, scored=True, groupedby="kind")
search_result = self.create_search_result(results)
return parsed_query, search_result
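With the change above, an unqualified query is parsed against only the `title` and `content` fields, and hits are grouped by `kind` instead of `url`. In whoosh terms the search boils down to roughly this (a sketch, assuming an open index `ix` built with the schema from this file):

```python
from whoosh.qparser import MultifieldParser

# Parse a free-text query against title + content, then group results by kind.
fields = ['title', 'content']
query = MultifieldParser(fields, schema=ix.schema).parse("centillion")
with ix.searcher() as searcher:
    results = searcher.search(query, terms=False, scored=True, groupedby="kind")
```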
@@ -589,7 +624,29 @@ class Search:
return s if len(s) <= l else s[0:l - 3] + '...'
def get_document_total_count(self):
return self.ix.searcher().doc_count_all()
p = QueryParser("kind", schema=self.ix.schema)
kind_labels = {
"documents" : "gdoc",
"issues" : "issue",
"comments" : "comment"
}
counts = {
"documents" : None,
"issues" : None,
"comments" : None,
"total" : None
}
for key in kind_labels:
kind = kind_labels[key]
q = p.parse(kind)
with self.ix.searcher() as s:
results = s.search(q,limit=None)
counts[key] = len(results)
counts['total'] = self.ix.searcher().doc_count_all()
return counts
if __name__ == "__main__":
search = Search("search_index")

config_centillion.json

@@ -1,7 +1,6 @@
{
"repositories" : [
"dcppc/2018-june-workshop",
"dcppc/2018-july-workshop",
"dcppc/data-stewards"
"dcppc/2018-july-workshop"
]
}

config_flask.py

@@ -1,27 +1,9 @@
# Path to markdown files
MARKDOWN_FILES_DIR = "/Users/charles/codes/whoosh/markdown-search/fake-docs/"
# Location of index file
INDEX_DIR = "search_index"
# Command to use when clicking on filepath in search results
EDIT_COMMAND = "view"
# Toggle to show Whoosh parsed query
SHOW_PARSED_QUERY=True
# Toogle to use tags
USE_TAGS=True
# Optional prefix in a markdown file, e.g. "tags: python search markdown tutorial"
TAGS_PREFIX=""
# List of tags that should be ignored
TAGS_TO_IGNORE = "and are what how its not with the"
# Regular expression to select tags, eg tag has to start with alphanumeric followed by at least two alphanumeric or "-" or "."
TAGS_REGEX = r"\b([A-Za-z0-9][A-Za-z0-9-.]{2,})\b"
# Flask settings
DEBUG = True
SECRET_KEY = '42c5a8eda356ca9d9c3ab2d149541e6b91d843fa'

docs/index.md Normal file (+54 lines)

@@ -0,0 +1,54 @@
# The Centillion
**the centillion**: a pan-github-markdown-issues-google-docs search engine.
**a centillion**: a very large number consisting of a 1 with 303 zeros after it.
the centillion is 3.03 log-times better than the googol.
## what is it
The centillion is a search engine built using [whoosh](https://whoosh.readthedocs.io/en/latest/intro.html),
a Python library for building search engines.
We define the types of documents the centillion should index,
what info and how. The centillion then builds and
updates a search index. That's all done in `centillion_search.py`.
The centillion also provides a simple web frontend for running
queries against the search index. That's done using a Flask server
defined in `centillion.py`.
The centillion keeps it simple.
## quickstart
Run the centillion app with a github access token API key set via
environment variable:
```
GITHUB_TOKEN="XXXXXXXX" python centillion.py
```
This will start a Flask server, and you can view the minimal search engine
interface in your browser at <http://localhost:5000>.
## work that is done
See [standalone.md](standalone.md) for the summary of
the three standalone whoosh servers that were built:
one for a folder of markdown files, one for github issues
and comments, and one for google drive documents.
## work that is being done
See [workinprogress.md](workinprogress.md) for details about
work in progress.
## work that is planned
See [plans.md](plans.md)

View File

@@ -31,3 +31,4 @@ Stateless

docs/standalone.md

@@ -1,4 +1,4 @@
## work that is done
## work that is done: standalone
**Stage 1: index folder of markdown files** (done)
* See [markdown-search](https://git.charlesreid1.com/charlesreid1/markdown-search.git)
@@ -13,7 +13,7 @@
Needs work:
* More appropriate schema
* <s>More appropriate schema</s>
* Using more features (weights) plus pandoc filters for schema
* Sqlalchemy (and hey waddya know safari books has it covered)
@@ -25,15 +25,16 @@ Needs work:
* Main win here is uncovering metadata/linking/presentation issues
Needs work:
- treat comments and issues as separate objects, fill out separate schema fields
- <s>treat comments and issues as separate objects, fill out separate schema fields
- map out and organize how the schema is updated to make it more flexible
- configuration needs to enable user to specify organization+repos
- configuration needs to enable user to specify organization+repos</s>
```plain
{
"to_index" : {
"google" : "google-api-python-client",
"microsoft" : ["TypeCode","api-guidelines"]
"to_index" : [
"google/google-api-python-client",
"microsoft/TypeCode",
"microsoft/api-guielines"
}
}
```
@@ -48,3 +49,4 @@ Needs work:
* Use the google drive api (see simple-simon)
* Main win is more uncovering of metadata issues, identifying
big-picture issues for centillion

docs/workinprogress.md Normal file (+48 lines)

@@ -0,0 +1,48 @@
# Components
The components of centillion are as follows:
- Flask application, which creates a Search object and uses it to search index
- Search object, which allows you to create/update/search an index
## Routes layout
Centillion flask app routes:
- `/home`
- if not logged in, landing page
- if logged in, redirect to search
- `/search`
- `/main_index_update`
- update main index, all docs period
## Functions layout
Centillion Search class functions:
- `open_index()` creates the schema
- `add_issue()`, `add_md()`, `add_document()` have three diff method sigs and add diff types
of documents to the search index
- `update_all_issues()` or `update_all_md()` or `update_all_documents()` iterates over items
and determines whether each item needs to be updated in the search index
- `update_main_index()` - update the entire search index
- calls all three update_all methods
- `create_search_results()` - package things up for jinja
- `search()` - run the query, pass results to the jinja-packager
Nice to have but focus on it later:
- update diff search index (what's been added since last index time)
- max index time
## Files layout
Schema definition:
* include a "kind" or "class" to group objects
* can provide different searches of different collections
* eventually can provide user with checkboxes
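The schema notes above correspond to the single flat whoosh schema used in `centillion_search.py`, where a stored `kind` field ('gdoc', 'issue', 'comment') groups the different document types. A sketch of such a schema using the field names that appear in this diff; the `ID` fields mirror the hunks above, while the `TEXT`/`BOOLEAN` choices for the remaining fields are assumptions:

```python
from whoosh.fields import Schema, ID, TEXT, BOOLEAN
from whoosh.analysis import StemmingAnalyzer

# Illustrative unified schema: one "kind" field groups gdocs, issues, and comments.
schema = Schema(
    id=ID(stored=True, unique=True),
    kind=ID(stored=True),                  # 'gdoc', 'issue', or 'comment'
    created_time=ID(stored=True),
    modified_time=ID(stored=True),
    indexed_time=ID(stored=True),
    title=TEXT(stored=True),
    url=ID(stored=True),
    is_comment=BOOLEAN(stored=True),
    mimetype=ID(stored=True),
    owner_email=ID(stored=True),
    owner_name=TEXT(stored=True),
    repo_name=TEXT(stored=True),
    repo_url=ID(stored=True),
    github_user=TEXT(stored=True),
    issue_title=TEXT(stored=True),
    issue_url=ID(stored=True),
    content=TEXT(stored=True, analyzer=StemmingAnalyzer())
)
```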

img/ss.png Normal file (new binary image, not shown; 356 KiB)

mkdocs-material Submodule

Submodule mkdocs-material added at 6569122bb1

static/bootstrap.min.css vendored Normal file (+6 lines)

File diff suppressed because one or more lines are too long

static/centillion_black.png Normal file (new binary image, not shown; 29 KiB)

static/centillion_white.png Normal file (new binary image, not shown; 25 KiB)

New binary image, name not shown in this view (30 KiB)

static/style.css

@@ -1,3 +1,24 @@
li.search-group-item {
position: relative;
display: block;
padding: 0px;
margin-bottom: -1px;
background-color: #fff;
border: 1px solid #ddd;
}
div.list-group {
border: 1px solid rgba(86,61,124,.2);
}
div.url {
background-color: rgba(86,61,124,.15);
padding: 8px;
}
/***************************/
body {
font-family: sans-serif;
}
@@ -56,7 +77,7 @@ table {
overflow: hidden;
}
td.info, .last-searches {
.info, .last-searches {
color: gray;
font-size: 12px;
font-family: Arial, serif;

templates/layout.html

@@ -1,7 +1,8 @@
<!doctype html>
<title>Markdown Search</title>
<link rel="stylesheet" type="text/css" href="{{ url_for('static', filename='github-markdown.css') }}">
<link rel="stylesheet" type="text/css" href="{{ url_for('static', filename='style.css') }}">
<link rel="stylesheet" type="text/css" href="{{ url_for('static', filename='github-markdown.css') }}">
<link rel="stylesheet" type="text/css" href="{{ url_for('static', filename='bootstrap.min.css') }}">
<div>
{% for message in get_flashed_messages() %}
<div class="flash">{{ message }}</div>

templates/search.html

@@ -1,62 +1,151 @@
{% extends "layout.html" %}
{% block body %}
<h1><a href="{{ url_for('search')}}?query=&fields=">Search directory: {{ config.MARKDOWN_FILES_DIR }}</a></h1>
<a class="index" href="{{ url_for('update_index')}}">[update index]</a>
<a class="index" href="{{ url_for('update_index')}}?rebuild=True">[rebuild index]</a>
<form action="{{ url_for('search') }}" name="search">
<input type="text" name="query" value="{{ query }}">
<input type="submit" value="search">
<a href="{{ url_for('search')}}?query=&fields=">[clear]</a>
</form>
<table cellspacing="3">
{% if directories %}
<tr>
<td class="directories-cloud">File directories:&nbsp
<div class="container">
<div class="row">
<div class="col12sm">
<center>
<a href="{{ url_for('search')}}?query=&fields=">
<img src="{{ url_for('static', filename='centillion_white.png') }}">
</a>
</center>
</div>
</div>
<div class="row">
<div class="col12sm">
<center>
<h2>
<a href="{{ url_for('search')}}?query=&fields=">
Search the DCPPC
</a>
</h2>
</center>
</div>
</div>
<div class="row">
<div class="col-12">
<center>
<a class="index" href="{{ url_for('update_index')}}">[update index]</a>
<a class="index" href="{{ url_for('update_index')}}?rebuild=True">[rebuild index]</a>
<form action="{{ url_for('search') }}" name="search">
<input type="text" name="query" value="{{ query }}"> <br />
<button type="submit" style="font-size: 20px; padding: 10px; padding-left: 50px; padding-right: 50px;"
value="search" class="btn btn-primary">Search</button>
<br />
<a href="{{ url_for('search')}}?query=&fields=">[clear all results]</a>
</form>
</center>
</div>
</div>
</div>
<div class="container">
<div class="row">
{% if directories %}
<div class="col-12 info directories-cloud">
File directories:&nbsp
{% for d in directories %}
<a href="{{url_for('search')}}?query={{d|trim}}&fields=filename">{{d|trim}}</a>
{% endfor %}
</td>
</tr>
{% endif %}
{% if config['SHOW_PARSED_QUERY']%}
<tr>
<td class="info">Parsed query: {{ parsed_query }}</td>
</tr>
{% endif %}
<tr>
<td class="info">FOUND {{ entries | length }} results of {{total}} documents</td>
</tr>
</div>
{% endif %}
{% for e in entries %}
<tr>
<td class="search-result">
<!--
<div class="path"><a href='{{ url_for("open_file")}}?path={{e.path|urlencode}}&query={{query}}&fields={{fields}}'>{{e.path}}</a>score: {{'%d' % e.score}}</div>
-->
<div class="url">
{% if e.is_comment %}
<b>Comment</b> <a href='{{e.url}}'>(comment link)</a>
on issue <a href='{{e.issue_url}}'>{{e.issue_title}}</a>
in repo <a href='{{e.repo_url}}'>dcppc/{{e.repo_name}}</a>
<br />
{% else %}
<b>Issue</b> <a href='{{e.issue_url}}'>{{e.issue_title}}</a>
in repo <a href='{{e.repo_url}}'>dcppc/{{e.repo_name}}</a>
<br />
{% endif %}
score: {{'%d' % e.score}}
</div>
<div class="markdown-body">{{ e.content_highlight|safe}}</div>
</td>
</tr>
{% endfor %}
</table>
<div class="last-searches">Last searches: <br/>
{% for s in last_searches %}
<span><a href="{{url_for('search')}}?{{s}}">{{s}}</a></span>
{% endfor %}
<ul class="list-group">
{% if config['SHOW_PARSED_QUERY'] and parsed_query %}
<li class="list-group-item">
<div class="col-12 info">
<b>Parsed query:</b> {{ parsed_query }}
</div>
</li>
{% endif %}
{% if parsed_query %}
<li class="list-group-item">
<div class="col-12 info">
<b>Found:</b> {{entries|length}} documents with results, out of {{totals["total"]}} total documents
</div>
</li>
{% endif %}
<li class="list-group-item">
<div class="col-12 info">
<b>Indexing:</b> {{totals["documents"]}} Google Documents,
{{totals["issues"]}} Github issues, and
{{totals["comments"]}} Github comments
</div>
</li>
</ul>
</div>
</div>
<p>
More info can be found in the <a href="https://github.com/BernhardWenzel/markdown-search">README.md file</a>
</p>
<div class="container">
<div class="row">
<ul class="list-group">
{% for e in entries %}
<li class="search-group-item">
<div class="url">
{% if e.kind=="gdoc" %}
<b>Google Drive File:</b>
<a href='{{e.url}}'>{{e.title}}</a>
({{e.owner_name}}, {{e.owner_email}})
{% elif e.kind=="comment" %}
<b>Comment:</b>
<a href='{{e.url}}'>Comment (link)</a>
{% if e.github_user %}
by <a href='https://github.com/{{e.github_user}}'>@{{e.github_user}}</a>
{% endif %}
on issue <a href='{{e.issue_url}}'>{{e.issue_title}}</a>
<br/>
<b>Repository:</b> <a href='{{e.repo_url}}'>{{e.repo_name}}</a>
{% if e.github_user %}
{% endif %}
{% elif e.kind=="issue" %}
<b>Issue:</b>
<a href='{{e.issue_url}}'>{{e.issue_title}}</a>
{% if e.github_user %}
by <a href='https://github.com/{{e.github_user}}'>@{{e.github_user}}</a>
{% endif %}
<br/>
<b>Repository:</b> <a href='{{e.repo_url}}'>{{e.repo_name}}</a>
{% else %}
<b>Item:</b> (<a href='{{e.url}}'>link</a>)
{% endif %}
<br />
score: {{'%d' % e.score}}
</div>
<div class="markdown-body">{{ e.content_highlight|safe}}</div>
</li>
{% endfor %}
</ul>
</div>
</div>
<div class="container">
<div class="row">
<div class="col-12">
<div class="last-searches">Last searches: <br/>
{% for s in last_searches %}
<span><a href="{{url_for('search')}}?{{s}}">{{s}}</a></span>
{% endfor %}
</div>
<p>
More info can be found in the <a href="https://github.com/BernhardWenzel/markdown-search">README.md file</a>
</p>
</div>
</div>
</div>
{% endblock %}