Compare commits
38 Commits
Author | SHA1 | Date | |
---|---|---|---|
1a04814edf | |||
3fb72d409b | |||
d89e01221a | |||
6736f3f8ad | |||
abd13aba29 | |||
13e49cdaa6 | |||
83b2ce17fb | |||
5be0709070 | |||
9edd95a78d | |||
37615d8707 | |||
4b218f63b9 | |||
4e17c890bc | |||
1129ec38e0 | |||
875508c796 | |||
abc7a2aedf | |||
8f1e5faefc | |||
d5f63e2322 | |||
84e5560423 | |||
924c562c0a | |||
13c410ac5e | |||
4e79800e83 | |||
5b9570d8cd | |||
297a4b5977 | |||
69a6b5d680 | |||
3feca1aba3 | |||
493581f861 | |||
1b0ded809d | |||
78e77c7cf2 | |||
2f890d1aee | |||
937327f2cb | |||
ca0d88cfe6 | |||
5eda472072 | |||
d943c14678 | |||
6be785a056 | |||
65113a95f7 | |||
87c3f12c8f | |||
933884e9ab | |||
da9dea3f6b |
2
.gitignore
vendored
@@ -1,4 +1,4 @@
-config_*
+config_flask.py
 vp
 credentials.json
 drive*.json
6
.gitmodules
vendored
@@ -1,3 +1,3 @@
-[submodule "mkdocs-material"]
-path = mkdocs-material
-url = https://git.charlesreid1.com/charlesreid1/mkdocs-material.git
+[submodule "mkdocs-material-dib"]
+path = mkdocs-material-dib
+url = https://github.com/dib-lab/mkdocs-material-dib.git
45
Readme.md
@@ -1,18 +1,19 @@
|
|||||||
# The Centillion
|
# The Centillion
|
||||||
|
|
||||||
**the centillion**: a pan-github-markdown-issues-google-docs search engine.
|
**centillion**: a pan-github-markdown-issues-google-docs search engine.
|
||||||
|
|
||||||
**a centillion**: a very large number consisting of a 1 with 303 zeros after it.
|
**a centillion**: a very large number consisting of a 1 with 303 zeros after it.
|
||||||
|
|
||||||
the centillion is 3.03 log-times better than the googol.
|
one centillion is 3.03 log-times better than a googol.
|
||||||
|
|
||||||

|

|
||||||
|
|
||||||
|
|
||||||
## what is it
|
## what is it
|
||||||
|
|
||||||
The centillion is a search engine built using [whoosh](https://whoosh.readthedocs.io/en/latest/intro.html),
|
Centillion (https://github.com/dcppc/centillion) is a search engine that can index
|
||||||
a Python library for building search engines.
|
three kinds of collections: Google Documents, Github issues, and Markdown files in
|
||||||
|
Github repos.
|
||||||
|
|
||||||
We define the types of documents the centillion should index,
|
We define the types of documents the centillion should index,
|
||||||
what info and how. The centillion then builds and
|
what info and how. The centillion then builds and
|
||||||
@@ -24,6 +25,30 @@ defined in `centillion.py`.
|
|||||||
|
|
||||||
The centillion keeps it simple.
|
The centillion keeps it simple.
|
||||||
|
|
||||||
|
## authentication layer
|
||||||
|
|
||||||
|
Centillion lives behind a Github authentication layer, implemented with
|
||||||
|
[flask-dance](https://github.com/singingwolfboy/flask-dance). When you first
|
||||||
|
visit the site it will ask you to authenticate with Github so that it can
|
||||||
|
verify you have permission to access the site.
|
||||||
|
|
||||||
|
## technologies
|
||||||
|
|
||||||
|
Centillion is a Python program built using whoosh (search engine library). It
|
||||||
|
indexes the full text of docx files in Google Documents, just the filenames for
|
||||||
|
non-docx files. The full text of issues and their comments are indexed, and
|
||||||
|
results are grouped by issue. Centillion requires Google Drive and Github OAuth
|
||||||
|
apps. Once you provide credentials to Flask you're all set to go.
|
||||||
|
|
||||||
|
|
||||||
|
## control panel
|
||||||
|
|
||||||
|
There's also a control panel at <https://search.nihdatacommons.us/control_panel>
|
||||||
|
that allows you to rebuild the search index from scratch (the Google Drive indexing
|
||||||
|
takes a while).
|
||||||
|
|
||||||
|

|
||||||
|
|
||||||
|
|
||||||
## quickstart (with Github auth)
|
## quickstart (with Github auth)
|
||||||
|
|
||||||
@@ -31,6 +56,8 @@ Start by creating a Github OAuth application.
|
|||||||
Get the public and private application key
|
Get the public and private application key
|
||||||
(client token and client secret token)
|
(client token and client secret token)
|
||||||
from the Github application's page.
|
from the Github application's page.
|
||||||
|
You will also need a Github access token
|
||||||
|
(in addition to the app tokens).
|
||||||
|
|
||||||
When you create the application, set the callback
|
When you create the application, set the callback
|
||||||
URL to `/login/github/authorized`, as in:
|
URL to `/login/github/authorized`, as in:
|
||||||
@@ -65,11 +92,3 @@ as HTTP by Github, even though there is an HTTPS address, and
|
|||||||
everything else seems fine, try deleting the Github OAuth app
|
everything else seems fine, try deleting the Github OAuth app
|
||||||
and creating a new one.
|
and creating a new one.
|
||||||
|
|
||||||
|
|
||||||
## more info
|
|
||||||
|
|
||||||
For more info see the documentation: <https://charlesreid1.github.io/centillion>
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
@@ -27,10 +27,10 @@ You provide:
|
|||||||
|
|
||||||
|
|
||||||
class UpdateIndexTask(object):
|
class UpdateIndexTask(object):
|
||||||
def __init__(self, gh_oauth_token, diff_index=False):
|
def __init__(self, gh_access_token, diff_index=False):
|
||||||
self.diff_index = diff_index
|
self.diff_index = diff_index
|
||||||
thread = threading.Thread(target=self.run, args=())
|
thread = threading.Thread(target=self.run, args=())
|
||||||
self.gh_oauth_token = gh_oauth_token
|
self.gh_access_token = gh_access_token
|
||||||
thread.daemon = True
|
thread.daemon = True
|
||||||
thread.start()
|
thread.start()
|
||||||
|
|
||||||
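Reviewer note: the hunk above only renames the token argument, but the pattern it relies on (do the slow index rebuild on a daemon thread so the request handler returns immediately) is easy to get subtly wrong. The following is a minimal sketch of that pattern, not the project's code: `BackgroundTask` and the `job` callable are hypothetical names introduced for illustration. It stores all state before the thread starts, which avoids `run()` ever seeing a half-initialized object.

```python
import threading


class BackgroundTask(object):
    """Hypothetical stand-in for UpdateIndexTask: runs a job on a daemon thread."""

    def __init__(self, token, job):
        # Store all state first, so run() never sees a half-initialized object.
        self.token = token
        self.job = job
        self.result = None
        self.thread = threading.Thread(target=self.run, args=())
        self.thread.daemon = True  # do not keep the interpreter alive at exit
        self.thread.start()

    def run(self):
        # The real task would call the search.update_index_* methods here.
        self.result = self.job(self.token)


# "ZZZ" mirrors the placeholder token used in config_flask.py.
task = BackgroundTask("ZZZ", lambda token: "indexed with %s" % token)
task.thread.join()  # demo only; a web handler would return without waiting
print(task.result)
```

Because the thread is a daemon, a rebuild in progress is abandoned if the server process exits, which is presumably acceptable for an index that can be rebuilt from the control panel.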
@@ -43,8 +43,8 @@ class UpdateIndexTask(object):
|
|||||||
from get_centillion_config import get_centillion_config
|
from get_centillion_config import get_centillion_config
|
||||||
config = get_centillion_config('config_centillion.json')
|
config = get_centillion_config('config_centillion.json')
|
||||||
|
|
||||||
search.update_index_markdown(self.gh_oauth_token,config)
|
search.update_index_issues(self.gh_access_token,config)
|
||||||
search.update_index_issues(self.gh_oauth_token,config)
|
search.update_index_markdown(self.gh_access_token,config)
|
||||||
search.update_index_gdocs(config)
|
search.update_index_gdocs(config)
|
||||||
|
|
||||||
|
|
||||||
@@ -55,11 +55,11 @@ app.wsgi_app = ProxyFix(app.wsgi_app)
|
|||||||
# Load default config and override config from an environment variable
|
# Load default config and override config from an environment variable
|
||||||
app.config.from_pyfile("config_flask.py")
|
app.config.from_pyfile("config_flask.py")
|
||||||
|
|
||||||
github_bp = make_github_blueprint()
|
#github_bp = make_github_blueprint()
|
||||||
#github_bp = make_github_blueprint(
|
github_bp = make_github_blueprint(
|
||||||
# client_id = os.environ.get('GITHUB_OAUTH_CLIENT_ID'),
|
client_id = os.environ.get('GITHUB_OAUTH_CLIENT_ID'),
|
||||||
# client_secret = os.environ.get('GITHUB_OAUTH_CLIENT_SECRET'),
|
client_secret = os.environ.get('GITHUB_OAUTH_CLIENT_SECRET'),
|
||||||
# scope='read:org')
|
scope='read:org')
|
||||||
|
|
||||||
app.register_blueprint(github_bp, url_prefix="/login")
|
app.register_blueprint(github_bp, url_prefix="/login")
|
||||||
|
|
||||||
@@ -172,11 +172,13 @@ def update_index():
|
|||||||
mresp = github.get('/teams/%s/members/%s'%(copper_team_id,username))
|
mresp = github.get('/teams/%s/members/%s'%(copper_team_id,username))
|
||||||
if mresp.status_code==204:
|
if mresp.status_code==204:
|
||||||
|
|
||||||
gh_oauth_token = github.token['access_token']
|
#gh_oauth_token = github.token['access_token']
|
||||||
|
gh_access_token = app.config['GITHUB_TOKEN']
|
||||||
|
|
||||||
# --------------------
|
# --------------------
|
||||||
# Business as usual
|
# Business as usual
|
||||||
UpdateIndexTask(gh_oauth_token, diff_index=False)
|
UpdateIndexTask(gh_access_token,
|
||||||
|
diff_index=False)
|
||||||
flash("Rebuilding index, check console output")
|
flash("Rebuilding index, check console output")
|
||||||
return render_template("controlpanel.html",
|
return render_template("controlpanel.html",
|
||||||
totals={})
|
totals={})
|
||||||
@@ -216,5 +218,6 @@ def oops(e):
|
|||||||
return contents404
|
return contents404
|
||||||
|
|
||||||
if __name__ == '__main__':
|
if __name__ == '__main__':
|
||||||
|
os.environ['OAUTHLIB_INSECURE_TRANSPORT'] = 'true'
|
||||||
app.run(host="0.0.0.0",port=5000)
|
app.run(host="0.0.0.0",port=5000)
|
||||||
|
|
||||||
|
@@ -1,7 +1,7 @@
|
|||||||
import shutil
|
import shutil
|
||||||
import html.parser
|
import html.parser
|
||||||
|
|
||||||
from github import Github
|
from github import Github, GithubException
|
||||||
import base64
|
import base64
|
||||||
|
|
||||||
from gdrive_util import GDrive
|
from gdrive_util import GDrive
|
||||||
@@ -252,7 +252,6 @@ class Search:
|
|||||||
with open(fullpath_input, 'wb') as f:
|
with open(fullpath_input, 'wb') as f:
|
||||||
f.write(r.content)
|
f.write(r.content)
|
||||||
|
|
||||||
|
|
||||||
# Try to convert docx file to plain text
|
# Try to convert docx file to plain text
|
||||||
try:
|
try:
|
||||||
output = pypandoc.convert_file(fullpath_input,
|
output = pypandoc.convert_file(fullpath_input,
|
||||||
@@ -316,7 +315,7 @@ class Search:
|
|||||||
# to a search index.
|
# to a search index.
|
||||||
|
|
||||||
|
|
||||||
def add_issue(self, writer, issue, config, update=True):
|
def add_issue(self, writer, issue, gh_access_token, config, update=True):
|
||||||
"""
|
"""
|
||||||
Add a Github issue/comment to a search index.
|
Add a Github issue/comment to a search index.
|
||||||
"""
|
"""
|
||||||
@@ -368,7 +367,7 @@ class Search:
|
|||||||
|
|
||||||
|
|
||||||
|
|
||||||
def add_markdown(self, writer, d, config, update=True):
|
def add_markdown(self, writer, d, gh_access_token, config, update=True):
|
||||||
"""
|
"""
|
||||||
Use a Github markdown document API record
|
Use a Github markdown document API record
|
||||||
to add a markdown document's contents to
|
to add a markdown document's contents to
|
||||||
@@ -385,18 +384,27 @@ class Search:
|
|||||||
_, fname = os.path.split(fpath)
|
_, fname = os.path.split(fpath)
|
||||||
_, fext = os.path.splitext(fpath)
|
_, fext = os.path.splitext(fpath)
|
||||||
|
|
||||||
print("Indexing markdown doc %s"%(fname))
|
print("Indexing markdown doc %s from repo %s"%(fname,repo_name))
|
||||||
|
|
||||||
# Unpack the requests response and decode the content
|
# Unpack the requests response and decode the content
|
||||||
response = requests.get(furl)
|
#
|
||||||
|
# don't forget the headers for private repos!
|
||||||
|
# useful: https://bit.ly/2LSAflS
|
||||||
|
|
||||||
|
headers = {'Authorization' : 'token %s'%(gh_access_token)}
|
||||||
|
|
||||||
|
response = requests.get(furl, headers=headers)
|
||||||
|
if response.status_code==200:
|
||||||
jresponse = response.json()
|
jresponse = response.json()
|
||||||
content = ""
|
content = ""
|
||||||
try:
|
try:
|
||||||
binary_content = re.sub('\n','',jresponse['content'])
|
binary_content = re.sub('\n','',jresponse['content'])
|
||||||
content = base64.b64decode(binary_content).decode('utf-8')
|
content = base64.b64decode(binary_content).decode('utf-8')
|
||||||
|
|
||||||
except KeyError:
|
except KeyError:
|
||||||
print(" > XXXXXXXX Failed to extract 'content' field. You probably hit the rate limit.")
|
print(" > XXXXXXXX Failed to extract 'content' field. You probably hit the rate limit.")
|
||||||
|
|
||||||
|
else:
|
||||||
|
print(" > XXXXXXXX Failed to reach file URL. There may be a problem with authentication/headers.")
|
||||||
return
|
return
|
||||||
|
|
||||||
# Now create the actual search index record
|
# Now create the actual search index record
|
||||||
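Reviewer note: the hunk above adds an `Authorization` header so that file contents can be fetched from private repos, and it decodes the base64 `content` field only when the request succeeds. A minimal offline sketch of those two steps follows. The helper names `build_headers` and `decode_contents_payload` are hypothetical (the diff does this inline in `add_markdown`), and the payload is a fake dictionary that only imitates the `content` field of the response, so no network call is made.

```python
import base64
import re


def build_headers(gh_access_token):
    """Build the header the diff passes to requests.get(furl, headers=headers)."""
    return {'Authorization': 'token %s' % (gh_access_token)}


def decode_contents_payload(jresponse):
    """Decode the base64 'content' field of a JSON payload.

    Returns None when the field is missing, which is the case the diff
    reports as a probable rate-limit failure.
    """
    if 'content' not in jresponse:
        return None
    # The encoded text arrives wrapped with newlines; strip them first.
    binary_content = re.sub('\n', '', jresponse['content'])
    return base64.b64decode(binary_content).decode('utf-8')


# Fake payload for illustration: encode a short document, then insert a
# newline to imitate the line wrapping of a real response.
encoded = base64.b64encode("# The Centillion\n".encode('utf-8')).decode('ascii')
fake_payload = {'content': encoded[:8] + '\n' + encoded[8:]}

print(build_headers("ZZZ"))
print(decode_contents_payload(fake_payload))
```

Returning `None` instead of catching `KeyError` is a design choice of this sketch only; the diff keeps the `try`/`except KeyError` form.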
@@ -431,6 +439,10 @@ class Search:
|
|||||||
# Define how to update search index
|
# Define how to update search index
|
||||||
# using different kinds of collections
|
# using different kinds of collections
|
||||||
|
|
||||||
|
|
||||||
|
# ------------------------------
|
||||||
|
# Google Drive Files/Documents
|
||||||
|
|
||||||
def update_index_gdocs(self,
|
def update_index_gdocs(self,
|
||||||
config):
|
config):
|
||||||
"""
|
"""
|
||||||
@@ -478,7 +490,7 @@ class Search:
|
|||||||
remote_ids = set()
|
remote_ids = set()
|
||||||
full_items = {}
|
full_items = {}
|
||||||
while True:
|
while True:
|
||||||
ps = 12
|
ps = 100
|
||||||
results = drive.list(
|
results = drive.list(
|
||||||
pageSize=ps,
|
pageSize=ps,
|
||||||
pageToken=nextPageToken,
|
pageToken=nextPageToken,
|
||||||
@@ -496,11 +508,11 @@ class Search:
|
|||||||
# Also store the doc
|
# Also store the doc
|
||||||
full_items[f['id']] = f
|
full_items[f['id']] = f
|
||||||
|
|
||||||
# Shorter:
|
## Shorter:
|
||||||
break
|
|
||||||
## Longer:
|
|
||||||
#if nextPageToken is None:
|
|
||||||
#break
|
#break
|
||||||
|
# Longer:
|
||||||
|
if nextPageToken is None:
|
||||||
|
break
|
||||||
|
|
||||||
|
|
||||||
writer = self.ix.writer()
|
writer = self.ix.writer()
|
||||||
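Reviewer note: the hunk above replaces the unconditional `break` (which stopped after the first page of Google Drive results) with a loop that runs until no `nextPageToken` is returned, and raises the page size from 12 to 100. A minimal sketch of that paging loop is below. `list_all` and `fake_list_page` are hypothetical names, and the fake pager only imitates the `files` and `nextPageToken` fields of a listing response; it is an assumption for illustration, not the Drive client.

```python
def list_all(list_page, page_size=100):
    """Collect items from a paged listing until no next-page token is returned."""
    items = []
    page_token = None
    while True:
        results = list_page(pageSize=page_size, pageToken=page_token)
        items.extend(results.get('files', []))
        page_token = results.get('nextPageToken')
        if page_token is None:
            break
    return items


def fake_list_page(pageSize, pageToken):
    """Hypothetical pager over 250 fake items; the token is the next offset."""
    data = list(range(250))
    start = pageToken or 0
    end = start + pageSize
    page = {'files': data[start:end]}
    if end < len(data):
        page['nextPageToken'] = end
    return page


all_items = list_all(fake_list_page)
print(len(all_items))
```

With the old early `break`, the same loop would have returned only the first `page_size` items, which is consistent with the partial Drive index the change appears to fix.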
@@ -544,13 +556,13 @@ class Search:
|
|||||||
print("Done, updated %d documents in the index" % count)
|
print("Done, updated %d documents in the index" % count)
|
||||||
|
|
||||||
|
|
||||||
|
# ------------------------------
|
||||||
|
# Github Issues/Comments
|
||||||
|
|
||||||
def update_index_issues(self, gh_oauth_token, config):
|
def update_index_issues(self, gh_access_token, config):
|
||||||
"""
|
"""
|
||||||
Update the search index using a collection of
|
Update the search index using a collection of
|
||||||
Github repo issues and comments.
|
Github repo issues and comments.
|
||||||
|
|
||||||
gh_oauth_token can also be an access token.
|
|
||||||
"""
|
"""
|
||||||
# Updated algorithm:
|
# Updated algorithm:
|
||||||
# - get set of indexed ids
|
# - get set of indexed ids
|
||||||
@@ -572,25 +584,29 @@ class Search:
|
|||||||
# Get the set of remote ids:
|
# Get the set of remote ids:
|
||||||
# ------
|
# ------
|
||||||
# Start with api object
|
# Start with api object
|
||||||
g = Github(gh_oauth_token)
|
g = Github(gh_access_token)
|
||||||
|
|
||||||
# Now index all issue threads in the user-specified repos
|
# Now index all issue threads in the user-specified repos
|
||||||
|
|
||||||
# Iterate over each repo
|
|
||||||
list_of_repos = config['repositories']
|
|
||||||
for r in list_of_repos:
|
|
||||||
|
|
||||||
# Start by collecting all the things
|
# Start by collecting all the things
|
||||||
remote_issues = set()
|
remote_issues = set()
|
||||||
full_items = {}
|
full_items = {}
|
||||||
|
|
||||||
|
# Iterate over each repo
|
||||||
|
list_of_repos = config['repositories']
|
||||||
|
for r in list_of_repos:
|
||||||
|
|
||||||
if '/' not in r:
|
if '/' not in r:
|
||||||
err = "Error: specify org/reponame or user/reponame in list of repos"
|
err = "Error: specify org/reponame or user/reponame in list of repos"
|
||||||
raise Exception(err)
|
raise Exception(err)
|
||||||
|
|
||||||
this_org, this_repo = re.split('/',r)
|
this_org, this_repo = re.split('/',r)
|
||||||
|
try:
|
||||||
org = g.get_organization(this_org)
|
org = g.get_organization(this_org)
|
||||||
repo = org.get_repo(this_repo)
|
repo = org.get_repo(this_repo)
|
||||||
|
except:
|
||||||
|
print("Error: could not gain access to repository %s"%(r))
|
||||||
|
continue
|
||||||
|
|
||||||
# Iterate over each issue thread
|
# Iterate over each issue thread
|
||||||
issues = repo.get_issues()
|
issues = repo.get_issues()
|
||||||
@@ -622,7 +638,7 @@ class Search:
|
|||||||
# cop out
|
# cop out
|
||||||
writer.delete_by_term('id',update_issue)
|
writer.delete_by_term('id',update_issue)
|
||||||
item = full_items[update_issue]
|
item = full_items[update_issue]
|
||||||
self.add_issue(writer, item, config, update=True)
|
self.add_issue(writer, item, gh_access_token, config, update=True)
|
||||||
count += 1
|
count += 1
|
||||||
|
|
||||||
|
|
||||||
@@ -631,7 +647,7 @@ class Search:
|
|||||||
add_issues = remote_issues - indexed_issues
|
add_issues = remote_issues - indexed_issues
|
||||||
for add_issue in add_issues:
|
for add_issue in add_issues:
|
||||||
item = full_items[add_issue]
|
item = full_items[add_issue]
|
||||||
self.add_issue(writer, item, config, update=False)
|
self.add_issue(writer, item, gh_access_token, config, update=False)
|
||||||
count += 1
|
count += 1
|
||||||
|
|
||||||
|
|
||||||
@@ -640,13 +656,13 @@ class Search:
|
|||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
# ------------------------------
|
||||||
|
# Github Markdown Files
|
||||||
|
|
||||||
def update_index_markdown(self, gh_oauth_token, config):
|
def update_index_markdown(self, gh_access_token, config):
|
||||||
"""
|
"""
|
||||||
Update the search index using a collection of
|
Update the search index using a collection of
|
||||||
Markdown files from a Github repo.
|
Markdown files from a Github repo.
|
||||||
|
|
||||||
gh_oauth_token can also be an access token.
|
|
||||||
"""
|
"""
|
||||||
EXT = '.md'
|
EXT = '.md'
|
||||||
|
|
||||||
@@ -669,38 +685,48 @@ class Search:
|
|||||||
# Get the set of remote ids:
|
# Get the set of remote ids:
|
||||||
# ------
|
# ------
|
||||||
# Start with api object
|
# Start with api object
|
||||||
g = Github(gh_oauth_token)
|
g = Github(gh_access_token)
|
||||||
|
|
||||||
# Now index all markdown files
|
# Now index all markdown files
|
||||||
# in the user-specified repos
|
# in the user-specified repos
|
||||||
|
|
||||||
# Iterate over each repo
|
|
||||||
list_of_repos = config['repositories']
|
|
||||||
for r in list_of_repos:
|
|
||||||
|
|
||||||
# Start by collecting all the things
|
# Start by collecting all the things
|
||||||
remote_ids = set()
|
remote_ids = set()
|
||||||
full_items = {}
|
full_items = {}
|
||||||
|
|
||||||
|
# Iterate over each repo
|
||||||
|
list_of_repos = config['repositories']
|
||||||
|
for r in list_of_repos:
|
||||||
|
|
||||||
if '/' not in r:
|
if '/' not in r:
|
||||||
err = "Error: specify org/reponame or user/reponame in list of repos"
|
err = "Error: specify org/reponame or user/reponame in list of repos"
|
||||||
raise Exception(err)
|
raise Exception(err)
|
||||||
|
|
||||||
this_org, this_repo = re.split('/',r)
|
this_org, this_repo = re.split('/',r)
|
||||||
|
try:
|
||||||
org = g.get_organization(this_org)
|
org = g.get_organization(this_org)
|
||||||
repo = org.get_repo(this_repo)
|
repo = org.get_repo(this_repo)
|
||||||
|
except:
|
||||||
|
print("Error: could not gain access to repository %s"%(r))
|
||||||
|
continue
|
||||||
|
|
||||||
|
|
||||||
# ---------
|
# ---------
|
||||||
# begin markdown-specific code
|
# begin markdown-specific code
|
||||||
|
|
||||||
# Get head commit
|
# Get head commit
|
||||||
commits = repo.get_commits()
|
commits = repo.get_commits()
|
||||||
|
try:
|
||||||
last = commits[0]
|
last = commits[0]
|
||||||
sha = last.sha
|
sha = last.sha
|
||||||
|
except GithubException:
|
||||||
|
print("Error: could not get commits from repository %s"%(r))
|
||||||
|
continue
|
||||||
|
|
||||||
# Get all the docs
|
# Get all the docs
|
||||||
tree = repo.get_git_tree(sha=sha, recursive=True)
|
tree = repo.get_git_tree(sha=sha, recursive=True)
|
||||||
docs = tree.raw_data['tree']
|
docs = tree.raw_data['tree']
|
||||||
|
print("Parsing doc ids from repository %s"%(r))
|
||||||
|
|
||||||
for d in docs:
|
for d in docs:
|
||||||
|
|
||||||
@@ -736,10 +762,10 @@ class Search:
|
|||||||
# and in remote_ids
|
# and in remote_ids
|
||||||
update_ids = indexed_ids & remote_ids
|
update_ids = indexed_ids & remote_ids
|
||||||
for update_id in update_ids:
|
for update_id in update_ids:
|
||||||
# cop out
|
# cop out: just delete and re-add
|
||||||
writer.delete_by_term('id',update_id)
|
writer.delete_by_term('id',update_id)
|
||||||
item = full_items[update_id]
|
item = full_items[update_id]
|
||||||
self.add_markdown(writer, item, config, update=True)
|
self.add_markdown(writer, item, gh_access_token, config, update=True)
|
||||||
count += 1
|
count += 1
|
||||||
|
|
||||||
|
|
||||||
@@ -748,7 +774,7 @@ class Search:
|
|||||||
add_ids = remote_ids - indexed_ids
|
add_ids = remote_ids - indexed_ids
|
||||||
for add_id in add_ids:
|
for add_id in add_ids:
|
||||||
item = full_items[add_id]
|
item = full_items[add_id]
|
||||||
self.add_markdown(writer, item, config, update=False)
|
self.add_markdown(writer, item, gh_access_token, config, update=False)
|
||||||
count += 1
|
count += 1
|
||||||
|
|
||||||
|
|
||||||
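Reviewer note: the two hunks above thread `gh_access_token` through the update and add paths, which are selected with plain set arithmetic on document ids. A minimal sketch of that partition follows. `plan_index_changes` is a hypothetical helper, and the `delete` branch is an assumption: the deletion step is not visible in this diff, so it is shown here only as the natural third set.

```python
def plan_index_changes(indexed_ids, remote_ids):
    """Split ids into the operations needed to sync a local index with a remote."""
    return {
        # Assumption: ids that vanished remotely are dropped from the index.
        'delete': indexed_ids - remote_ids,
        # In both places: the diff deletes and re-adds these ("cop out").
        'update': indexed_ids & remote_ids,
        # Only remote: new documents to add.
        'add': remote_ids - indexed_ids,
    }


indexed = {'a.md', 'b.md', 'c.md'}
remote = {'b.md', 'c.md', 'd.md'}
plan = plan_index_changes(indexed, remote)
print(sorted(plan['update']), sorted(plan['add']), sorted(plan['delete']))
```

The three sets are disjoint and together cover every id seen on either side, so each document is handled exactly once per rebuild.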
@@ -757,6 +783,16 @@ class Search:
|
|||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
# ------------------------------
|
||||||
|
# Groups.io Emails
|
||||||
|
|
||||||
|
|
||||||
|
#def update_index_markdown(self, gh_access_token, config):
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
# ---------------------------------
|
# ---------------------------------
|
||||||
# Search results bundler
|
# Search results bundler
|
||||||
|
|
||||||
|
@@ -1,6 +1,27 @@
|
|||||||
{
|
{
|
||||||
"repositories" : [
|
"repositories" : [
|
||||||
|
"dcppc/project-management",
|
||||||
|
"dcppc/nih-demo-meetings",
|
||||||
|
"dcppc/internal",
|
||||||
|
"dcppc/organize",
|
||||||
|
"dcppc/dcppc-bot",
|
||||||
|
"dcppc/full-stacks",
|
||||||
|
"dcppc/markdown-issues",
|
||||||
|
"dcppc/design-guidelines-discuss",
|
||||||
|
"dcppc/dcppc-deliverables",
|
||||||
|
"dcppc/dcppc-milestones",
|
||||||
|
"dcppc/crosscut-metadata",
|
||||||
|
"dcppc/lucky-penny",
|
||||||
|
"dcppc/dcppc-workshops",
|
||||||
|
"dcppc/metadata-matrix",
|
||||||
|
"dcppc/data-stewards",
|
||||||
|
"dcppc/dcppc-phase1-demos",
|
||||||
|
"dcppc/apis",
|
||||||
"dcppc/2018-june-workshop",
|
"dcppc/2018-june-workshop",
|
||||||
"dcppc/2018-july-workshop"
|
"dcppc/2018-july-workshop",
|
||||||
|
"dcppc/2018-august-workshop",
|
||||||
|
"dcppc/2018-september-workshop",
|
||||||
|
"dcppc/design-guidelines",
|
||||||
|
"dcppc/2018-may-workshop"
|
||||||
]
|
]
|
||||||
}
|
}
|
||||||
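Reviewer note: every entry added to `repositories` above must be in `org/reponame` form, because both `update_index_issues` and `update_index_markdown` reject entries without a `/` and then split on it. A minimal sketch of that check, run up front on the whole list, is below. `parse_repositories` is a hypothetical helper; it swaps in `str.split` with a split limit and `ValueError` where the diff uses `re.split` and a bare `Exception`.

```python
import json


def parse_repositories(config):
    """Return (org, repo) pairs for each entry in config['repositories']."""
    pairs = []
    for r in config['repositories']:
        if '/' not in r:
            raise ValueError(
                "Error: specify org/reponame or user/reponame in list of repos")
        org, repo = r.split('/', 1)
        pairs.append((org, repo))
    return pairs


# Two entries taken from the list above.
config = json.loads('{"repositories": ["dcppc/internal", "dcppc/apis"]}')
print(parse_repositories(config))
```

Validating the list once at load time would surface a malformed entry before any API calls are made, rather than partway through a rebuild.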
|
@@ -2,17 +2,18 @@
 INDEX_DIR = "search_index"
 
 # oauth client deets
-GITHUB_OAUTH_CLIENT_ID = "63f8d49c651840cbe31e"
-GITHUB_OAUTH_CLIENT_SECRET = "36d9a4611f7427336d3c89ed041c45d086b793ee"
+GITHUB_OAUTH_CLIENT_ID = "XXX"
+GITHUB_OAUTH_CLIENT_SECRET = "YYY"
+GITHUB_TOKEN = "ZZZ"
 
 # More information footer: Repository label
-FOOTER_REPO_ORG = "charlesreid1"
+FOOTER_REPO_ORG = "dcppc"
 FOOTER_REPO_NAME = "centillion"
 
 # Toggle to show Whoosh parsed query
 SHOW_PARSED_QUERY=True
 
-TAGLINE = "Search all the things"
+TAGLINE = "Search the Data Commons"
 
 # Flask settings
 DEBUG = True
BIN
docs/images/cp.png
Normal file
Binary file not shown.
After Width: | Height: | Size: 498 KiB |
BIN
docs/images/ss.png
Normal file
Binary file not shown.
After Width: | Height: | Size: 355 KiB |
@@ -29,8 +29,7 @@ class GDrive(object):
|
|||||||
):
|
):
|
||||||
"""
|
"""
|
||||||
Set up the Google Drive API instance.
|
Set up the Google Drive API instance.
|
||||||
Factory method: create it and hand it over.
|
Factory method: create it here, hand it over in get_service().
|
||||||
Then we're finished.
|
|
||||||
"""
|
"""
|
||||||
self.credentials_file = credentials_file
|
self.credentials_file = credentials_file
|
||||||
self.client_secret_file = client_secret_file
|
self.client_secret_file = client_secret_file
|
||||||
@@ -40,6 +39,9 @@ class GDrive(object):
|
|||||||
self.store = file.Storage(credentials_file)
|
self.store = file.Storage(credentials_file)
|
||||||
|
|
||||||
def get_service(self):
|
def get_service(self):
|
||||||
|
"""
|
||||||
|
Return an instance of the Google Drive API service.
|
||||||
|
"""
|
||||||
|
|
||||||
creds = self.store.get()
|
creds = self.store.get()
|
||||||
if not creds or creds.invalid:
|
if not creds or creds.invalid:
|
||||||
|
BIN
img/ss.png
Binary file not shown.
Before Width: | Height: | Size: 356 KiB |
Submodule mkdocs-material deleted from 6569122bb1
1
mkdocs-material-dib
Submodule
Submodule mkdocs-material-dib added at c3dd912f3c
@@ -1,5 +1,5 @@
|
|||||||
<!doctype html>
|
<!doctype html>
|
||||||
<title>Markdown Search</title>
|
<title>Centillion Search Engine</title>
|
||||||
<link rel="stylesheet" type="text/css" href="{{ url_for('static', filename='style.css') }}">
|
<link rel="stylesheet" type="text/css" href="{{ url_for('static', filename='style.css') }}">
|
||||||
<link rel="stylesheet" type="text/css" href="{{ url_for('static', filename='github-markdown.css') }}">
|
<link rel="stylesheet" type="text/css" href="{{ url_for('static', filename='github-markdown.css') }}">
|
||||||
<link rel="stylesheet" type="text/css" href="{{ url_for('static', filename='bootstrap.min.css') }}">
|
<link rel="stylesheet" type="text/css" href="{{ url_for('static', filename='bootstrap.min.css') }}">
|
||||||
|
@@ -107,12 +107,18 @@
|
|||||||
|
|
||||||
<div class="url">
|
<div class="url">
|
||||||
{% if e.kind=="gdoc" %}
|
{% if e.kind=="gdoc" %}
|
||||||
<b>Google Drive File:</b>
|
{% if e.mimetype=="document" %}
|
||||||
|
<b>Google Document:</b>
|
||||||
<a href='{{e.url}}'>{{e.title}}</a>
|
<a href='{{e.url}}'>{{e.title}}</a>
|
||||||
(Owner: {{e.owner_name}}, {{e.owner_email}})
|
(Type: {{e.mimetype}}, Owner: {{e.owner_name}}, {{e.owner_email}})
|
||||||
|
{% else %}
|
||||||
|
<b>Google Drive:</b>
|
||||||
|
<a href='{{e.url}}'>{{e.title}}</a>
|
||||||
|
(Type: {{e.mimetype}}, Owner: {{e.owner_name}}, {{e.owner_email}})
|
||||||
|
{% endif %}
|
||||||
|
|
||||||
{% elif e.kind=="issue" %}
|
{% elif e.kind=="issue" %}
|
||||||
<b>Issue:</b>
|
<b>Github Issue:</b>
|
||||||
<a href='{{e.url}}'>{{e.title}}</a>
|
<a href='{{e.url}}'>{{e.title}}</a>
|
||||||
{% if e.github_user %}
|
{% if e.github_user %}
|
||||||
opened by <a href='https://github.com/{{e.github_user}}'>@{{e.github_user}}</a>
|
opened by <a href='https://github.com/{{e.github_user}}'>@{{e.github_user}}</a>
|
||||||
@@ -121,7 +127,7 @@
|
|||||||
<b>Repository:</b> <a href='{{e.repo_url}}'>{{e.repo_name}}</a>
|
<b>Repository:</b> <a href='{{e.repo_url}}'>{{e.repo_name}}</a>
|
||||||
|
|
||||||
{% elif e.kind=="markdown" %}
|
{% elif e.kind=="markdown" %}
|
||||||
<b>Markdown:</b>
|
<b>Github Markdown:</b>
|
||||||
<a href='{{e.url}}'>{{e.title}}</a>
|
<a href='{{e.url}}'>{{e.title}}</a>
|
||||||
<br/>
|
<br/>
|
||||||
<b>Repository:</b> <a href='{{e.repo_url}}'>{{e.repo_name}}</a>
|
<b>Repository:</b> <a href='{{e.repo_url}}'>{{e.repo_name}}</a>
|
||||||
@@ -131,9 +137,15 @@
|
|||||||
|
|
||||||
{% endif %}
|
{% endif %}
|
||||||
<br />
|
<br />
|
||||||
score: {{'%d' % e.score}}
|
Score: {{'%d' % e.score}}
|
||||||
|
</div>
|
||||||
|
<div class="markdown-body">
|
||||||
|
{% if e.content_highlight %}
|
||||||
|
{{ e.content_highlight|safe}}
|
||||||
|
{% else %}
|
||||||
|
<p>(A preview of this document is not available.)</p>
|
||||||
|
{% endif %}
|
||||||
</div>
|
</div>
|
||||||
<div class="markdown-body">{{ e.content_highlight|safe}}</div>
|
|
||||||
|
|
||||||
</li>
|
</li>
|
||||||
{% endfor %}
|
{% endfor %}
|
||||||
|