User:Tom.Reding/JPL to Infobox planet
What this code does
This C# AWB module is meant to facilitate, and reduce the errors associated with, the tedious transcription of data from JPL's Small-Body Database into most of WP's ~3,200 minor planet articles.
- General conformity: Generally tries to model itself on the lowest-numbered MPs, which are popular enough to be considered the consensus for what to display and how (after looking at a subset of 15-25 of them). If I have mischaracterized something, please bring it up on this page's talk page.
  - Uses {{Convert|value|AU|km/Gm/Tm}} with |aphelion=, (abbr=on) |semimajor=, |perihelion=, |moid=, & |jupiter_moid=.
  - Uses {{val|value|error}} when the error is known for |rotation=, |albedo=, |mean_radius=, & |abs_magnitude=; otherwise uses the bare value.
  - If |background= is not a 3-6-digit hexadecimal value, or does not exist, then makes/adds the default |background=#FFFFC0 (see User:Rfassbind/sandbox/color-scheme).
- Spacing conformity & standardization:
  - Maintains the existing spacing format of the infobox before and after | and = characters, based on most existing, non-empty parameters (up to 108 out of 114, and the most frequent usage wins).
  - If 3 or more spaces exist before the =, it's assumed that some form of =-alignment is being used, so h-alignment is maintained.
  - 2 or more spaces after the = are assumed to be typos and are truncated to 1.
  - If one of the 3 |*_ref= parameters starts with &thinsp;, adds &thinsp; to the other 2, if they're not empty.
  - If &thinsp; is used > 40% of the time that it could have been used between values and <ref>/{{efn}}s, then it is applied 100% of the time, including headings (|*_ref=s).
  - Unicode tabs are replaced with a single space prior to whitespace standardization.
- Standardization/Formatting/Cleanup:
  - Appends/adds a JPL ref to |orbit_ref=, if no ref names in that parameter contain "jpl|sbdb|orbit" (case insensitive) or possess a JPL URL. If a "jpl" ref name doesn't exist in the article, nor a named ref with a JPL URL, adds a full ref named "jpldata" to |orbit_ref=; otherwise, the 1st ref containing "jpl|sbdb|orbit" or a JPL URL is used, in that order.
  - Fills in "bare" JPL & Lowell FTP refs anywhere.
  - Replaces unnamed {{Cite SBDB}} refs with the master <ref name="jpldata">{{Cite web|...}}, which is modeled after {{Cite SBDB}}.
  - Ref-wraps any bare URL in |orbit_ref=.
  - Removes deprecated parameters (listed below).
  - Corrects {{Infobox planet}} aliases to {{Infobox planet.
  - Moves end-of-line | to the next line.
  - Moves parameters on the "{{Infobox planet" line to the next line.
  - Moves multiple infobox-parameters (in-line citation-parameters are untouched) on the same line to multiple lines.
  - If 1 OE parameter contains a symbol ((Ω), (M), etc.), symbols will be added to/maintained on all other OE params.
  - Moves |mean_radius= under |dimensions= for easier checking.
  - Moves text between the last <ref>/{{efn}}/{{Cite SBDB}} and EndOfLine to before it.
  - Moves lines starting with <ref or </ref up to the end of the previous line.
  - Removes lines that don't start with |, {{, or }}, with/out leading whitespace.
  - Removes parameter-description comments from non-empty parameters.
  - Removes the redundant JPL link in ==External links==.
  - Fixes mislinks to List of minor planets: 1001–2000, etc.
  - Removes text after the last <br> if a reference exists before it, but not after it.
    - Removes text after <br> no matter what, for |period=.
- Precision: All modified values are also rounded. Mimics the existing infobox precision if >= 5 digits; otherwise it defaults to 5-digit precision (similar to JPL's error-precision) or the average precision of existing parameters, whichever is higher, if possible. Exceptions:
  - |period=: The year value's precision is unmodified because it is rarely (if ever) more precise than 0.01 years (3.6525 days). The day value, which seems to be derived from years instead of vice versa, is truncated to 0-2 decimal places, inversely proportional to the size of the year value (i.e. a several hundred+ year period is given an integer day value). It's tempting to ascribe 4-decimal precision to days, but multiplying an uncertain value by a very certain (or exact) value does not bestow that certainty on the result. I.e. 0.0001 days leads the reader to assume an 8.64-second precision, when the value is actually swimming in a 3.65-day uncertainty.
  - JPL's MOID values are unmodified, due to not having any listed error values.
  - |mean_motion= (new) uses {{Deg2DMS}} if < 1°.
- Uncertainty:
  - Uncertainty (±) is not displayed for Orbital Elements (OE), because 1) the uncertainty is generally extremely small (MPs with low |observation_arc= might be skipped in the future, or maybe uncertainties >= 10% will be shown; haven't decided yet (needs consensus, ideally)), and 2) OE uncertainties generally don't appear in the lowest #'d MPs (see General conformity, above).
  - When available, uncertainties are used for |abs_magnitude= & |albedo=, since MP size is very sensitive to these values, and for |mean_radius=, |dimensions=, and |rotation=.
  - Displays uncertainty for OE if they're large (>= 1, or |mean_motion= error > 1/2 the value).
- Updating access-date:
  - Appends or replaces the |access-date= value with <Today's d MMMM yyyy> in the jpl/sbdb/orbit 'master' ref (excludes {{JPL small body}} & {{MPCit JPL}}).
  - Adopts the local spacing convention when appending access-date.
- Accommodating existing values:
  - Prepends JPL's value + JPL ref for |abs_magnitude=, |rotation=, |albedo=, |mean_radius=, |dimensions= if non-JPL refs exist in that parameter; otherwise updates the JPL value (assuming the ref name contains "jpl", "sbdb", or a JPL URL, case insensitive).
  - Prefers to <br />-separate different values, if ,-separation doesn't exist.
    - Prefers ,-separation for |abs_magnitude=, since multiple values can easily render on 1 line.
- Skipping:
  - Skips pages without an {{Infobox planet}} or alias (to do, low priority: allow code to build an infobox if none exist).
  - Skips infoboxes that don't end with }} or |}} on a separate line (~8% of MP infoboxes prior to fixing).
  - Skips pages with recently updated JPL references, based on |access-date= and user-defined month & year lists.
- New parameters: Appends new parameters to the bottom of the infobox, except |minorplanet=yes and |background=#FFFFC0, which are inserted at the top, and |mean_radius=, which is inserted/moved to below |dimensions=.
- Scope: The following 41 {{Infobox planet}} parameters are accounted for/operated on in some way (alphabetized, () = deprecated, * = fix spacing only):
  - |abs_magnitude=
  - |albedo=
  - |aphelion=
  - |arg_peri=
  - |asc_node=
  - |atmosphere_ref=
  - |background=
  - (|bgcolour=)
  - |caption= *
  - (|designations=)
  - (|diameter=) never used
  - |dimensions=
  - |discovered=
  - (|discovery=)
  - |discovery_ref=
  - |eccentricity=
  - |epoch=
  - |inclination=
  - |jupiter_moid=
  - |label_width=
  - |mass=
  - |mean_anomaly=
  - |mean_motion=
  - |mean_radius=
  - |minorplanet=
  - |moid=
  - |mp_name= *
  - |name= *
  - |named_after= *
  - |observation_arc=
  - |orbit_ref=
  - (|orbital_characteristics=)
  - |p_orbit_ref=
  - |perihelion=
  - |period=
  - (|physical_characteristics=)
  - |rotation=
  - |semimajor=
  - |tisserand=
  - |uncertainty=
  - (|width=)
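The template and uncertainty rules listed above can be sketched roughly as follows. This is only an illustration, not the module's actual code: the helper names (FormatVal, FormatConvert, ShowOrbitalElementError) and the sample numbers are made up, the choice between km/Gm/Tm is left to the caller because the rule for picking one isn't spelled out here, and ">= 1" is assumed to be in the parameter's own units. It is written as standalone C# 9 top-level code, since AWB modules can't define custom classes anyway.

```csharp
using System;

// Hypothetical helper: use {{val}} only when an error is known, else the bare value.
static string FormatVal(string value, string error)
{
    return string.IsNullOrEmpty(error)
        ? value
        : "{{val|" + value + "|" + error + "}}";
}

// Hypothetical helper: wrap an AU distance in {{Convert}}.
// targetUnit is one of km, Gm, or Tm and is chosen by the caller (assumption).
static string FormatConvert(string valueAU, string targetUnit, bool abbrOn)
{
    return "{{Convert|" + valueAU + "|AU|" + targetUnit + (abbrOn ? "|abbr=on" : "") + "}}";
}

// Orbital-element uncertainty is shown only if it is large:
// error >= 1, or (for mean_motion) error > 1/2 the value.
static bool ShowOrbitalElementError(string param, double value, double error)
{
    if (error >= 1.0) return true;
    return param == "mean_motion" && error > Math.Abs(value) / 2.0;
}

// Made-up sample values, for illustration only.
Console.WriteLine(FormatVal("0.2109", "0.012"));        // {{val|0.2109|0.012}}
Console.WriteLine(FormatVal("11.16", ""));              // 11.16
Console.WriteLine(FormatConvert("2.7675", "Gm", true)); // {{Convert|2.7675|AU|Gm|abbr=on}}
```

Keeping the values as strings avoids any accidental re-rounding when a number is wrapped in a template.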
The settings file is essentially empty. The only boxes checked are Options > Apply general fixes, Skip > No changes made and Page is redirect, and Start > Minor edit.
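For the access-date update described in the list above, a minimal standalone sketch might look like the following. UpdateAccessDate is a hypothetical name, the example URL is a placeholder, and the fixed " |access-date=" used when appending is a simplification: the module is described as adopting the local spacing convention, which this sketch does not attempt.

```csharp
using System;
using System.Globalization;
using System.Text.RegularExpressions;

// Hypothetical helper: replace an existing |access-date= value, or append one
// before the closing "}}" of the citation template if none exists.
static string UpdateAccessDate(string refText, DateTime today)
{
    // "d MMMM yyyy" gives e.g. "18 March 2016" (no zero-padding on the day).
    string stamp = today.ToString("d MMMM yyyy", CultureInfo.InvariantCulture);

    // Group 1 = "|access-date=" with its spacing; group 2 = whitespace + the next "|" or "}".
    Regex rAccessDate = new Regex(@"(\|\s*access-?date\s*=\s*)[^|}]*?(\s*[|}])", RegexOptions.IgnoreCase);
    if (rAccessDate.IsMatch(refText))
        return rAccessDate.Replace(refText, m => m.Groups[1].Value + stamp + m.Groups[2].Value, 1);

    int close = refText.LastIndexOf("}}", StringComparison.Ordinal);
    if (close < 0) return refText; // not a template; leave untouched
    return refText.Insert(close, " |access-date=" + stamp);
}

Console.WriteLine(UpdateAccessDate(
    "{{Cite web |url=http://example.org |access-date=1 January 2015 |title=X}}",
    new DateTime(2016, 3, 18)));
```

Passing the date in as a parameter, rather than calling DateTime.Now inside the helper, keeps the sketch easy to check by hand.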
What this code does not
This is not a "set it and forget it" script.
- It must be babysat.
- It's only meant to keep obscure, infrequently edited minor planets up-to-date (i.e. best for numbered MPs > 500-1000, and more care required for those < 500).
- Frequently edited pages with multiple-ref infoboxes might not play well with this script (but many do).
- Each modified parameter-line in the infobox must be examined so that no useful information is lost and no wikisyntax is broken. This is due to the large number of display-variants, which the script attempts to account for, but exceptions will always exist and they need to be spotted.
  - Most display-variants (multiple <ref>s (<br>-delimited or ,-delimited), multiple values with or without their measurement error, {{efn}}, {{small}}, {{val}}, {{convert}}) are accounted for, for these parameters only (the most common offenders, and because you're not allowed to make custom classes in AWB): |abs_magnitude=, |rotation=, |albedo=, |mean_radius=, |dimensions=.
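To give a feel for why multi-value parameter lines need a human eye, here is a rough standalone sketch that splits a <br>-delimited parameter value and flags which entries carry a JPL-looking ref. SplitValues and HasJplRef are hypothetical names, the sample value and the "LCDB" ref name are made up, and the patterns are deliberately simpler than whatever the module really uses; only the "jpl|sbdb|orbit" name test and the JPL URL test come from the description above.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper: split a parameter value on <br>, <br/>, or <br />.
static List<string> SplitValues(string paramValue)
{
    var parts = new List<string>();
    foreach (string p in Regex.Split(paramValue, @"<br\s*/?>", RegexOptions.IgnoreCase))
        if (p.Trim().Length > 0) parts.Add(p.Trim());
    return parts;
}

// Hypothetical helper: true if the entry has a ref whose name contains
// jpl/sbdb/orbit (case insensitive), or if it contains a JPL SSD URL.
static bool HasJplRef(string entry)
{
    return Regex.IsMatch(entry, @"<ref\s+name\s*=\s*""?[^>]*(?:jpl|sbdb|orbit)[^>]*>", RegexOptions.IgnoreCase)
        || Regex.IsMatch(entry, @"ssd\.jpl\.nasa\.gov", RegexOptions.IgnoreCase);
}

// Made-up sample value: one JPL-sourced entry, one from another source.
string sample = "{{val|11.2|0.1}}<ref name=jpldata/><br />11.5<ref name=\"LCDB\"/>";
foreach (string entry in SplitValues(sample))
    Console.WriteLine((HasJplRef(entry) ? "JPL:     " : "non-JPL: ") + entry);
```

Anything this kind of check classifies as non-JPL is exactly the material that should be left in place and verified by eye after the edit.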
~ Tom.Reding (talk ⋅dgaf) 17:22, 18 March 2016 (UTC)
(Syntax highlighting is broken because code is too long)
public string ProcessArticle(string ArticleText, string ArticleTitle, int wikiNamespace, out string Summary, out bool Skip)
{
// global switches & vars ///////////////////////////////////////////////////
// all custom switches = true by default
bool skipMisspelledArticleTitles = true; // true to skip 16596 Stevenstrauss/Stephenstrauss, etc.
bool skipProvisionalJPLandWPMismatches = true; // some provisional asteroids don't show up in JPL
bool skipAlmostAllJPLandWPMismatches = true; // any mismatch other than diacritics
bool skipEntirePageIfAnyJPLUnitsDontMatchExpected = true;
bool skipIfRecentlyUpdated = true; // based on access-date in JPL ref
string SkipIfRecentlyUpdated_InMonth = @"February, March, April, May"; // full month names; case INsensitive; commas optional; if string is empty, match any month
string SkipIfRecentlyUpdated_InYear = @"2016";
Skip = false;
Summary = @"Update infobox with JPL data ([[User:Tom.Reding/JPL to Infobox planet|code]])";
string OriginalSummary = Summary; // used to catch exceptions & trigger skip-conditions in various sections
string ArticleTitle_latin = "dummy ArticleTitle_latin"; // need non-empty dummy declares here to make vars global
string JPL_DDate = "dummy JPL_DDate";
string Unique_ID = "dummy Unique_ID"; // either a # (for #+name and #+prelim MPs), or prelim (for prelim MPs)
string Param = "dummy Param";
string RefName_jpl = "dummy RefName_jpl";
string RefName_lowell = "dummy RefName_lowell";
string JPLRefNameVariants = @"jpl|sbdb|orbit"; // regex
string JPLDataRef_Master = "dummy JPLDataRef_Master";
string JPLDataRef_MasterOrSlave = "dummy JPLDataRef_MasterOrSlave";
string LowellRef_Master = "dummy LowellRef_Master";
string LowellRef_MasterOrSlave = "dummy LowellRef_MasterOrSlave";
bool jplDataMasterExists = false;
bool lowellMasterExists = false;
bool areSymbolsUsed = false; // "(Ω)", "(M)", etc.
string OldInfobox = "dummy OldInfobox";
string NewInfobox = "dummy NewInfobox";
// Infobox spacing variables (before & after "|" and "=", and &thinsp;<ref>)
string S1, S2, S3, S4;
S1 = S2 = S3 = S4 = "";
string S12 = S1 + "|" + S2;
int EqualAlignTo = 0;
string ThinspRefs = ""; // populate with unicode thinsp, if necessary
string ThinspRefs_HTML = ""; // populate with HTML &thinsp;, if necessary
string ThinspHeadings = "";
// append parameters to infobox if DNE
Regex rAppendToInfobox = new Regex(@"(.+[\r\n]+ *)\|? *(\}\}\s)", RegexOptions.Singleline);
// lists of parameters
List<string> FrequentlyMalformedParamList = new List<string>(new string[] { "name", "mp_name", "named_after", "background", "caption", "width" }); // 6
List<string> UpdatableParamList = new List<string>(new string[] { "abs_magnitude", "albedo", "aphelion", "arg_peri", "asc_node",
"bgcolour", "designations", "dimensions", "discovered", "eccentricity", "epoch", "inclination", "jupiter_moid", "mass", "mean_anomaly",
"mean_radius", "minorplanet", "moid", "observation_arc", "orbit_ref", "p_mean_motion", "perihelion", "period", "rotation",
"semimajor", "tisserand", "uncertainty" }); // 27 of the non-deprecated params this code updates (excludes FrequentlyMalformedParamList)
List<string> AlmostAllParamList = new List<string>(new string[] { "minorplanet", "extrasolarplanet", "symbol", "image", "image_alt",
"discovery_ref", "discoverer", "discovery_site", "discovered", "discovery_method", "designations", "pronounced", "alt_names",
"mp_category", "adjectives", "orbit_ref", "orbit_diagram", "epoch", "uncertainty", "observation_arc", "apsis", "aphelion",
"perihelion", "periastron", "apoastron", "periapsis", "apoapsis", "semimajor", "mean_orbit_radius", "eccentricity", "period",
"synodic_period", "avg_speed", "mean_anomaly", "inclination", "angular_dist", "asc_node", "long_periastron", "time_periastron",
"arg_peri", "semi-amplitude", "satellite_of", "satellites", "moid", "mercury_moid", "venus_moid", "mars_moid", "jupiter_moid",
"saturn_moid", "uranus_moid", "neptune_moid", "p_orbit_ref", "p_semimajor", "p_eccentricity", "p_inclination", "p_mean_motion",
"perihelion_rate", "node_rate", "dimensions", "mean_radius", "equatorial_radius", "polar_radius", "flattening", "circumference",
"surface_area", "volume", "mass", "density", "surface_grav", "moment_of_inertia_factor", "escape_velocity", "sidereal_day",
"rot_velocity", "rotation", "axial_tilt", "right_asc_north_pole", "declination", "pole_ecliptic_lat", "pole_ecliptic_lon",
"albedo", "single_temperature", "temp_name1", "min_temp_1", "mean_temp_1", "max_temp_1", "temp_name2", "min_temp_2",
"mean_temp_2", "max_temp_2", "temp_name3", "min_temp_3", "mean_temp_3", "max_temp_3", "temp_name4", "min_temp_4", "mean_temp_4",
"max_temp_4", "spectral_type", "family", "magnitude", "abs_magnitude", "angular_size", "atmosphere_ref", "surface_pressure",
"scale_height", "atmosphere_composition", "note", "tisserand", "label_width" }); // 109 non-deprecated params (excludes FrequentlyMalformedParamList)
List<string> AllParamList = AlmostAllParamList.Concat(FrequentlyMalformedParamList).ToList(); // all 115 listed params (109 + 6)
// auto-skip conditions /////////////////////////////////////////////////////
// no {{Infobox planet}} or variants
string InfoboxAliases_Pattern = @"Extrasolar Planet|Planet|Infobox (?:Moon|Nonstellarbody|Planet|asteroid|extrasolar planet|minor planet|small, medium[a-z, \(\)]+)";
string InfoboxText_Pattern = @"\{\{\s*(" + InfoboxAliases_Pattern + @")\s*\|.+?[\r\n]+ *\|? *\}\}\s";
Match mInfoboxExists = Regex.Match(ArticleText, @"\{\{\s*(" + InfoboxAliases_Pattern + @")\s*\|", RegexOptions.IgnoreCase);
if (mInfoboxExists.Success)
{
// Attempt to isolate all {{Infobox}} text for faster processing, shorter regex, & fewer mismatches.
// Infobox must end with "[\r\n]+ *\|? *}}\s".
Match mInfoboxText1 = Regex.Match(ArticleText, InfoboxText_Pattern, RegexOptions.IgnoreCase | RegexOptions.Singleline);
if (mInfoboxText1.Success)
{
// make sure infobox isolation didn't grab too much
string ArticleTitle_Pattern = @"([\r\n]+|\}\})[ ]*\'\'\'(" + ArticleTitle + @"|(\(\d+\) )?\{\{\s*mp\s*\|[^\}]+\}\})\'\'\'"; // typically designates the start of body-text
Match mArticleTitle = Regex.Match(mInfoboxText1.Value, ArticleTitle_Pattern, RegexOptions.IgnoreCase);
if (mArticleTitle.Success) Summary = @"{{Infobox}} doesn't end with ""}}"" on a separate line."; // skip
// sterilize to "{{Infobox planet"
NewInfobox = OldInfobox = mInfoboxText1.Value;
NewInfobox = Regex.Replace(NewInfobox, @"\{\{\s*(" + InfoboxAliases_Pattern + ")", @"{{Infobox planet", RegexOptions.IgnoreCase);
}
else if (!mInfoboxText1.Success)
Summary = @"{{Infobox}} doesn't end with ""}}"" on a separate line.";
}
else if (!mInfoboxExists.Success)
Summary = @"No {{Infobox planet}} or alias.";
if (Summary != OriginalSummary) // summary skip-condition
Skip = true;
// use ArticleTitle to determine Unique_ID for URL //////////////////////////
// rep non-keyboard apostrophes/hyphens (even MPC's diacritized list only uses keyboard apostrophes for all but 1 MP)
ArticleTitle_latin = Regex.Replace(ArticleTitle, @"[‐–—-]", "-"); // i.e. WP's 6344 P–L == JPL's 6344 P-L
ArticleTitle_latin = Regex.Replace(ArticleTitle_latin, @"['‘’´`]", "'"); // EXCEPT 7613 `akikiki (WP == JPL == MPC)
// ArticleTitle_latin = <prelim>?
Match mIsPrelim_WP = Regex.Match(ArticleTitle_latin, @"^\d\d\d\d [A-Z][A-Z-][L\d-]{0,4}$");
if ( mIsPrelim_WP.Success) Unique_ID = ArticleTitle_latin;
if (!mIsPrelim_WP.Success)
{
// ArticleTitle_latin = "(#) <prelim>"?
string NumPrelimPattern_WP = @"\((\d+)\) (\d\d\d\d [A-Z][A-Z-][L\d-]{0,4})$";
Match mIsNumPrelim_WP = Regex.Match(ArticleTitle_latin, NumPrelimPattern_WP);
if (mIsNumPrelim_WP.Success) Unique_ID = mIsNumPrelim_WP.Groups[1].Value;
if (!mIsNumPrelim_WP.Success)
{
// ArticleTitle_latin = "# <name>"?
string NumNamePattern_WP = @"^(\d+) ([\w\s\.'-]+)";
Match mIsNumName_WP = Regex.Match(ArticleTitle_latin, NumNamePattern_WP);
if ( mIsNumName_WP.Success) Unique_ID = mIsNumName_WP.Groups[1].Value;
if (!mIsNumName_WP.Success)
{
// if we get here, something's wrong
Summary = @"Bad article name? Couldn't determine if ArticleTitle_latin = prelim, (#)+prelim, or #+name...";
Skip = true;
}
}
}
// grab JPL HTML info ///////////////////////////////////////////////////////
// get JPL page text
string JPL_URL = @"http://ssd.jpl.nasa.gov/sbdb.cgi?sstr=" + Unique_ID.Replace(" ", "");
string JPL_ExternalText = Tools.GetHTML(JPL_URL);
string JPL_DiscoveryDatePattern = @"Discovered (\d\d\d\d)[\s‐–—-]*(\w+)\.?[\s‐–—-]*(\d\d?)";
Match mJPL_DiscoveryDate = Regex.Match(JPL_ExternalText, JPL_DiscoveryDatePattern);
string JPL_Designation_Pattern = @"1""\>\<b\>([^\<]+)\</b\>\</font\>\</td\>\</tr\>";
string JPL_Designation = Regex.Match(JPL_ExternalText, JPL_Designation_Pattern).Groups[1].Value;
// Skip if JPL != ArticleTitle //////////////////////////////////////////////
// define 'regional' variables
string MPprelim = "dummy MPprelim";
string MPnumber = "dummy MPnumber";
string MPname = "dummy MPname";
// JPL_Designation = ArticleTitle = "<prelim>"?
Match mIsPrelim_JPL = Regex.Match(JPL_Designation, @"^\((\d\d\d\d [A-Z][A-Z‐–—-][L\d‐–—-]{0,4})\)$");
if (mIsPrelim_JPL.Success && !Skip)
{
MPname = "";
MPnumber = "";
MPprelim = mIsPrelim_JPL.Groups[1].Value;
// ArticleTitle_latin only has [‐–—-]['‘’´`] -> [-]['] at this point
if (skipProvisionalJPLandWPMismatches && ArticleTitle_latin != MPprelim)
{
Summary = @"ArticleTitle_latin != JPL's prelim designation";
Skip = true;
}
}
else if (!mIsPrelim_JPL.Success && !Skip)
{
// JPL_Designation = "# (<prelim>)" = ArticleTitle = "(#) <prelim>"?
string NumPrelimPattern_JPL = @"^(\d+) \((\d\d\d\d [A-Z][A-Z‐–—-][L\d‐–—-]{0,4})\)$";
MPname = "";
MPnumber = "";
MPprelim = "";
Match mIsNumPrelim_JPL = Regex.Match(JPL_Designation, NumPrelimPattern_JPL);
if (mIsNumPrelim_JPL.Success)
{
MPname = "";
MPnumber = mIsNumPrelim_JPL.Groups[1].Value;
MPprelim = mIsNumPrelim_JPL.Groups[2].Value;
if (ArticleTitle_latin != "(" + MPnumber + ") " + MPprelim)
{
Summary = @"ArticleTitle_latin != JPL's #+prelim";
Skip = true;
}
}
else if (!mIsNumPrelim_JPL.Success)
{
// JPL_Designation = "# <name> (<prelim>)"
// JPL only starts including "(<prelim>)" AFTER ~332...It appears much less frequently <= 332.
// Presumably, b/c the prelims for MP#s <= 332 weren't officially designated (b/c discovered before the 1890s).
string NumNamePattern_JPL = @"^(\d+) ([\w\s\.'‘’´`‐–—-]+)(\s\()?";
Match mIsNumName_JPL = Regex.Match(JPL_Designation, NumNamePattern_JPL);
if (mIsNumName_JPL.Success)
{
MPname = mIsNumName_JPL.Groups[2].Value.Trim();
MPnumber = mIsNumName_JPL.Groups[1].Value.Trim();
MPprelim = "";
// remove ArticleTitle diacritics to check JPL name, which doesn't use diacritics nor non-keyboard apostrophes (except 7613 `akikiki...)
// remove lowercase diacritics from ArticleTitle (have to specially code for this since CharUnicodeInfo isn't available, it seems)
ArticleTitle_latin = Regex.Replace(ArticleTitle_latin, @"[áàȧâäǎăāãåąⱥấầắằǡǻǟẫẵảȁȃẩẳạḁậặɑ̀āаⱥẚ]", "a");
ArticleTitle_latin = Regex.Replace(ArticleTitle_latin, @"[ḃƀɓḅḇƃбᶀɓƀ]", "b");
ArticleTitle_latin = Regex.Replace(ArticleTitle_latin, @"[ćċĉčçȼḉƈɔсȼƈȼƈ]", "c");
ArticleTitle_latin = Regex.Replace(ArticleTitle_latin, @"[ḋďḑđƌɗḍḓḏðɖɗɖᶁƌȡɗɖ]", "d");
ArticleTitle_latin = Regex.Replace(ArticleTitle_latin, @"[éèėêëěĕēẽęȩɇếềḗḕễḝẻȅȇểẹḙḛệɛəёеɇɇǝ]", "e");
ArticleTitle_latin = Regex.Replace(ArticleTitle_latin, @"[ḟƒᶂ]", "f");
ArticleTitle_latin = Regex.Replace(ArticleTitle_latin, @"[ǵġĝǧğḡģǥɠɣɠǥᶃ]", "g");
ArticleTitle_latin = Regex.Replace(ArticleTitle_latin, @"[ḣĥḧȟḩħḥḫẖⱨħɦ]", "h");
ArticleTitle_latin = Regex.Replace(ArticleTitle_latin, @"[ıíìîïǐĭīĩįɨḯỉȉȋịḭɩіɨ]", "i");
ArticleTitle_latin = Regex.Replace(ArticleTitle_latin, @"[ĵǰɉ]", "j");
ArticleTitle_latin = Regex.Replace(ArticleTitle_latin, @"[ḱǩķƙḳḵⱪĸкƙᶄⱪ]", "k");
ArticleTitle_latin = Regex.Replace(ArticleTitle_latin, @"[ĺľɫⱡļƚłḷḽḻḹŀлłᶅƚȴⱡ]", "l");
ArticleTitle_latin = Regex.Replace(ArticleTitle_latin, @"[ḿṁɱṃмᶆ]", "m");
ArticleTitle_latin = Regex.Replace(ArticleTitle_latin, @"[ńǹṅňñņɲƞṇṋṉŋʼnƞȵᶇŋɲ]", "n");
ArticleTitle_latin = Regex.Replace(ArticleTitle_latin, @"[óòȯôöǒŏōõǫőốồɵøṓṑṍȫỗṏǿȭǭỏȍȏơổọớờỡộởợоøɵ]", "o");
ArticleTitle_latin = Regex.Replace(ArticleTitle_latin, @"[ṕṗᵽƥᶈƥᵽ]", "p");
ArticleTitle_latin = Regex.Replace(ArticleTitle_latin, @"[ɋʠ]", "q");
ArticleTitle_latin = Regex.Replace(ArticleTitle_latin, @"[ŕṙřŗɍɽȑȓṛṟṝɍᶉ]", "r");
ArticleTitle_latin = Regex.Replace(ArticleTitle_latin, @"[śṡŝšṥṧṣșȿṩſᶊȿ]", "s");
ArticleTitle_latin = Regex.Replace(ArticleTitle_latin, @"[ṫẗťţƭṭʈțṱṯⱦŧƭŧƫȶʈⱦ]", "t");
ArticleTitle_latin = Regex.Replace(ArticleTitle_latin, @"[úùûüǔŭūũůųűʉǘǜṹṻủȕȗưụṳứừṷṵữʊửựʉ]", "u");
ArticleTitle_latin = Regex.Replace(ArticleTitle_latin, @"[ṽṿʋʋⱴ]", "v");
ArticleTitle_latin = Regex.Replace(ArticleTitle_latin, @"[ẃẁẇŵẅẘẉⱳⱳ]", "w");
ArticleTitle_latin = Regex.Replace(ArticleTitle_latin, @"[ẋẍхᶍ]", "x");
ArticleTitle_latin = Regex.Replace(ArticleTitle_latin, @"[ýỳẏŷÿȳỹẙɏỷƴỵӳɏƴᶌ]", "y");
ArticleTitle_latin = Regex.Replace(ArticleTitle_latin, @"[źżẑžƶȥẓɀẕⱬƶᶎʐⱬȥɀ]", "z");
// remove uppercase diacritics (like for 4901 Ó Briain)
ArticleTitle_latin = Regex.Replace(ArticleTitle_latin, @"[ÁÀȧÂÄǍĂĀÃÅĄⱥẤẦẮẰǠǺǞẪẴẢȀȂẨẲẠḀẬẶɑ̀ĀАȺ]", "A");
ArticleTitle_latin = Regex.Replace(ArticleTitle_latin, @"[ḂƀƁḄḆƂɃƁ]", "B");
ArticleTitle_latin = Regex.Replace(ArticleTitle_latin, @"[ĆĊĈČÇȼḈƇƆСȻƇ]", "C");
ArticleTitle_latin = Regex.Replace(ArticleTitle_latin, @"[ḊĎḐĐƋƊḌḒḎÐƉƊƉƋ]", "D");
ArticleTitle_latin = Regex.Replace(ArticleTitle_latin, @"[ÉÈĖÊËĚĔĒẼĘȩɇẾỀḖḔỄḜẺȄȆỂẸḘḚỆƐƏЁЕƎƐɆ]", "E");
ArticleTitle_latin = Regex.Replace(ArticleTitle_latin, @"[ḞƑƑ]", "F");
ArticleTitle_latin = Regex.Replace(ArticleTitle_latin, @"[ǴĠĜǦĞḠĢǤƓƔƓǤ]", "G");
ArticleTitle_latin = Regex.Replace(ArticleTitle_latin, @"[ḢĤḦȟḨĦḤḪẖĦⱧ]", "H");
ArticleTitle_latin = Regex.Replace(ArticleTitle_latin, @"[IÍÌÎÏǏĬĪĨĮƗḮỈȈȊỊḬƖІƗ]", "I");
ArticleTitle_latin = Regex.Replace(ArticleTitle_latin, @"[ĴǰɉɈ]", "J");
ArticleTitle_latin = Regex.Replace(ArticleTitle_latin, @"[ḰǨĶƘḲḴКƘⱩ]", "K");
ArticleTitle_latin = Regex.Replace(ArticleTitle_latin, @"[ĹĽɫⱡĻƚŁḶḼḺḸĿŁȽⱠⱢ]", "L");
ArticleTitle_latin = Regex.Replace(ArticleTitle_latin, @"[ḾṀɱṂМ]", "M");
ArticleTitle_latin = Regex.Replace(ArticleTitle_latin, @"[ŃǹṄŇÑŅƝƞṆṊṈŊƝȠŊ]", "N");
ArticleTitle_latin = Regex.Replace(ArticleTitle_latin, @"[ÓÒȯÔÖǑŎŌÕǪŐỐỒƟØṒṐṌȫỖṎǾȭǬỎȌȎƠỔỌỚỜỠỘỞỢОƟØ]", "O");
ArticleTitle_latin = Regex.Replace(ArticleTitle_latin, @"[ṔṖᵽƤⱣƤ]", "P");
ArticleTitle_latin = Regex.Replace(ArticleTitle_latin, @"[ɋ]", "Q");
ArticleTitle_latin = Regex.Replace(ArticleTitle_latin, @"[ŔṘŘŖɍɽȐȒṚṞṜɌⱤ]", "R");
ArticleTitle_latin = Regex.Replace(ArticleTitle_latin, @"[ŚṠŜŠṤṦṢșȿṨ]", "S");
ArticleTitle_latin = Regex.Replace(ArticleTitle_latin, @"[ṪẗŤŢƬṬƮțṰṮⱦŦŦƬƮȾ]", "T");
ArticleTitle_latin = Regex.Replace(ArticleTitle_latin, @"[ÚÙÛÜǓŬŪŨŮŲŰʉǗǛṸṺỦȔȖƯỤṲỨỪṶṴỮƱỬỰɄ]", "U");
ArticleTitle_latin = Regex.Replace(ArticleTitle_latin, @"[ṼṾƲ]", "V");
ArticleTitle_latin = Regex.Replace(ArticleTitle_latin, @"[ẂẀẆŴẄẘẈⱳƜⱲ]", "W");
ArticleTitle_latin = Regex.Replace(ArticleTitle_latin, @"[ẊẌХ]", "X");
ArticleTitle_latin = Regex.Replace(ArticleTitle_latin, @"[ÝỲẎŶŸȳỸẙɏỶƳỴӲƳɎ]", "Y");
ArticleTitle_latin = Regex.Replace(ArticleTitle_latin, @"[ŹŻẐŽƵȥẒɀẔⱬƵⱫȤ]", "Z");
// other, from https://sourceforge.net/p/autowikibrowser/code/HEAD/tree/AWB/WikiFunctions/Tools.cs#l1181
ArticleTitle_latin = Regex.Replace(ArticleTitle_latin, @"[ÆǢǼ]", "Ae");
ArticleTitle_latin = Regex.Replace(ArticleTitle_latin, @"[æǣǽ]", "ae");
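// digraph/ligature code points are written as \u escapes so they can't be mistaken for (or normalized into) plain letters
ArticleTitle_latin = Regex.Replace(ArticleTitle_latin, @"[\u01C5\u01F2]", "Dz"); // Dž (U+01C5), Dz (U+01F2)
ArticleTitle_latin = Regex.Replace(ArticleTitle_latin, @"[\uFB02]", "fl"); // fl ligature (U+FB02)
ArticleTitle_latin = Regex.Replace(ArticleTitle_latin, @"[\u01C8]", "Lj"); // Lj (U+01C8)
ArticleTitle_latin = Regex.Replace(ArticleTitle_latin, @"[\u01C9]", "lj"); // lj (U+01C9)
ArticleTitle_latin = Regex.Replace(ArticleTitle_latin, @"[\u01CB]", "Nj"); // Nj (U+01CB)
ArticleTitle_latin = Regex.Replace(ArticleTitle_latin, @"[№]", "No");
ArticleTitle_latin = Regex.Replace(ArticleTitle_latin, @"[\u01CC]", "nj"); // nj (U+01CC)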
ArticleTitle_latin = Regex.Replace(ArticleTitle_latin, @"[Œ]", "Oe");
ArticleTitle_latin = Regex.Replace(ArticleTitle_latin, @"[œ]", "oe");
ArticleTitle_latin = Regex.Replace(ArticleTitle_latin, @"[ß]", "ss");
ArticleTitle_latin = Regex.Replace(ArticleTitle_latin, @"[Þ]", "Th");
ArticleTitle_latin = Regex.Replace(ArticleTitle_latin, @"[þ]", "th");
string JPL_MPnumname = MPnumber + " " + MPname;
if (ArticleTitle_latin != JPL_MPnumname)
{
if (JPL_Designation != @"7613 `akikiki (1996 DK)" && skipMisspelledArticleTitles)
{
Summary = @"ArticleTitle_latin != JPL's #+name designation.";
Skip = true;
}
}
}
// if we get here, something's wrong
// unless it's 7613 `akikiki (1996 DK), where JPL decides NOT to use a keyboard single quote
if (skipProvisionalJPLandWPMismatches && Unique_ID == ArticleTitle && !mIsNumName_JPL.Success)
{
if (Regex.Match(ArticleText, @"\[\[Category:Lost minor planets").Success)
Summary = @"Lost minor planet, and mismatch b/w JPL & WP";
else
Summary = @"Bad article name? Provisional name mismatch b/w JPL & WP.";
Skip = true;
}
else if (skipAlmostAllJPLandWPMismatches && !mIsNumName_JPL.Success)
{
if (JPL_Designation != "7613 `akikiki (1996 DK)")
{
Summary = @"Bad article name? Couldn't determine if JPL_Designation = prelim, (#)+prelim, or #+name...";
Skip = true;
}
}
} // end else if(!mIsNumPrelim_JPL.Success)
} // end else if(!mIsPrelim_JPL.Success)
// clean/combine inputs /////////////////////////////////////////////////////
// define 'regional' variables
string JPL_DDay = "dummy JPL_DDay";
string JPL_DMonth = "dummy JPL_DMonth";
string JPL_DYear = "dummy JPL_DYear";
if (mJPL_DiscoveryDate.Success && !Skip)
{
// leave days without 0 padding
JPL_DDay = mJPL_DiscoveryDate.Groups[3].Value.TrimStart('0');
// convert month to full name
JPL_DMonth = mJPL_DiscoveryDate.Groups[2].Value;
switch (JPL_DMonth)
{
case "Jan": JPL_DMonth="January"; break;
case "Feb": JPL_DMonth="February"; break;
case "Mar": JPL_DMonth="March"; break;
case "Apr": JPL_DMonth="April"; break;
case "May": JPL_DMonth="May"; break;
case "June": // exists
case "Jun": JPL_DMonth="June"; break;
case "July": // exists
case "Jul": JPL_DMonth="July"; break;
case "Aug": JPL_DMonth="August"; break;
case "Sept": // exists?
case "Sep": JPL_DMonth="September"; break;
case "Oct": JPL_DMonth="October"; break;
case "Nov": JPL_DMonth="November"; break;
case "Dec": JPL_DMonth="December"; break;
default:
Summary = @"Month: '" + JPL_DMonth + "' not found!";
Skip = true;
break;
}
JPL_DYear = mJPL_DiscoveryDate.Groups[1].Value;
// combine as DMY
JPL_DDate = JPL_DDay + " " + JPL_DMonth + " " + JPL_DYear;
}
else JPL_DDate = "";
// Preparatory infobox-only searches & changes //////////////////////////////
// replace whitespace-b/w-values-&-refs with &thinsp;
if (!Skip)
{
// \b[a-z]{1,5} to cover most units & ignore longer words
string ValRef_Pattern = @"(\{\{(?:Convert|Val|Small|Deg2DMS|Deg2HMS|e\|)[^\}]+\}\}|\]\]|\d+|\b[a-z]{1,5}|met[er][re]s?)(\s+)(\<ref|\{\{efn|\{\{cite sbdb)";
Match mValRef = Regex.Match(NewInfobox, ValRef_Pattern, RegexOptions.IgnoreCase);
if (mValRef.Success) NewInfobox = Regex.Replace(NewInfobox, ValRef_Pattern, @"$1&thinsp;$3", RegexOptions.IgnoreCase);
}
// move lines starting w "<ref" or "</ref" up 1 line
string RefLine_Pattern = "[\r\n]+" + @"(\s*\</?ref)";
Match mRefLine = Regex.Match(NewInfobox, RefLine_Pattern, RegexOptions.IgnoreCase);
if (mRefLine.Success) NewInfobox = Regex.Replace(NewInfobox, RefLine_Pattern, @"$1", RegexOptions.IgnoreCase);
// unicodify "&thinsp;"s to Unicode thin spaces (U+2009) in infobox ONLY
NewInfobox = NewInfobox.Replace("&thinsp;", "\u2009"); // unicodify during processing; DEunicodify back to html at the end
// determine the &thinsp; usage fraction and enforce &thinsp; usage later if > threshold %
string ThinspCount_Pattern = @"(\{\{(?:Convert|Val|Small|Deg2DMS|Deg2HMS|e\|)[^\}]+\}\}|\]\]|\)|\d+|['""a-z]|)(\u2009|&thinsp;)(\<ref|\{\{efn)"; // thinsp required; [a-z] to cover all possible units
double ThinspCount = Regex.Matches(NewInfobox, ThinspCount_Pattern, RegexOptions.IgnoreCase).Count;
string MaxPotentialThinsps_Pattern = @"(\{\{(?:Convert|Val|Small|Deg2DMS|Deg2HMS|e\|)[^\}]+\}\}|\]\]|\)|\d+|['""a-z])(\u2009|&thinsp;)?(\<ref|\{\{efn)"; // thinsp optional
double MaxPotentialThinsps = Regex.Matches(NewInfobox, MaxPotentialThinsps_Pattern, RegexOptions.IgnoreCase).Count;
double ThinspUsageFraction = ThinspCount / MaxPotentialThinsps;
if (ThinspUsageFraction > 0.4) // if > 40%, make it a thing (100% usage ratio)
{
ThinspRefs = "\u2009"; // Unicode thin space
ThinspRefs_HTML = "&thinsp;";
}
// determine &thinsp; usage in headings (different from usage between values & refs)
string ThinspHeadings_Pattern = @"\|\s*[\w_]+_ref\s*= *(\u2009|&thinsp;)([^\r\n]{5,})"; // orbit_ref, etc.; ">= 5" since "<ref>" = 5 characters long
int ThinspHeadingsCount = Regex.Matches(NewInfobox, ThinspHeadings_Pattern, RegexOptions.IgnoreCase).Count;
if (ThinspHeadingsCount > 0 | !string.IsNullOrEmpty(ThinspRefs)) ThinspHeadings = " ";
// rem text selectively after last <br>
foreach (string param in UpdatableParamList)
{
// remove "<br>([[Uncertainty Parameter U|Uncertainty]]=8)<ref name=jpldata/>" and the like
string BR_Pattern = @"(\|\s*" + param + @"\s*=[^\r\n\<]*?)\<br */?\>[^\r\n]*?Uncertainty Parameter[^\r\n]*";
Match mBR = Regex.Match(NewInfobox, BR_Pattern, RegexOptions.IgnoreCase);
if (mBR.Success) NewInfobox = Regex.Replace(NewInfobox, BR_Pattern, @"$1", RegexOptions.IgnoreCase);
// remove last "<br>" & subsequent in-line IB text IF a ref/efn/cite sbdb doesn't exist after on that line AND one does exist before the last <br>
BR_Pattern = @"(\|\s*" + param + @"\s*=[^\r\n]*)\<br */?\>(?![^\r\n]*(?:\<ref|\{\{efn|\{\{cite sbdb))";
mBR = Regex.Match(NewInfobox, BR_Pattern, RegexOptions.IgnoreCase);
if (mBR.Success)
{
Match mRef = Regex.Match(mBR.Groups[1].Value, @"(\<ref|\{\{efn|\{\{cite sbdb)", RegexOptions.IgnoreCase); // i.e. to exclude 705 Erminia's dimensions param
if (mRef.Success) NewInfobox = Regex.Replace(NewInfobox, BR_Pattern, @"$1", RegexOptions.IgnoreCase);
}
// remove last "<br>" & subsequent in-line IB text, no matter what, for specific params
if (param == "period")
{
BR_Pattern = @"(\|\s*" + param + @"\s*=[^\r\n]*)\<br */?\>[^\r\n]*";
mBR = Regex.Match(NewInfobox, BR_Pattern, RegexOptions.IgnoreCase);
if (mBR.Success) NewInfobox = Regex.Replace(NewInfobox, BR_Pattern, @"$1", RegexOptions.IgnoreCase);
}
}
// remove "<ref>Osculating Orbital Elements</ref>" and the like from infobox
string LatinOnlyRef_Pattern = @"\<ref\>[\sa-z]+\</ref\>";
Match mLatinOnlyRef = Regex.Match(NewInfobox, LatinOnlyRef_Pattern, RegexOptions.IgnoreCase);
if (mLatinOnlyRef.Success) NewInfobox = Regex.Replace(NewInfobox, LatinOnlyRef_Pattern, "", RegexOptions.IgnoreCase);
// remove "<sup>o</sup>" and the like from infobox
string UnnecessarySups_Pattern = @"\<sup\>[Oo°]\</sup\>[\)\]]?";
Match mUnnecessarySups = Regex.Match(NewInfobox, UnnecessarySups_Pattern, RegexOptions.IgnoreCase);
if (mUnnecessarySups.Success) NewInfobox = Regex.Replace(NewInfobox, UnnecessarySups_Pattern, "", RegexOptions.IgnoreCase);
// replace "([[IRAS]])" -> {{small|([[IRAS]])}}
string BareIRAS_Pattern = @"(?<!small\s*\|\s*)(\(\s*)?\[\[\s*IRAS\s*\]\]\s*(\:?\d?\)?)";
Match mBareIRAS = Regex.Match(NewInfobox, BareIRAS_Pattern, RegexOptions.IgnoreCase);
if (mBareIRAS.Success) NewInfobox = Regex.Replace(NewInfobox, BareIRAS_Pattern, @"{{small|$1[[IRAS]]$2}}", RegexOptions.IgnoreCase);
// Preparatory ArticleText-only searches & changes //////////////////////////
// don't isolate Infobox text until the next section
ArticleText = ArticleText.Replace(OldInfobox, NewInfobox); // save infobox changes
OldInfobox = NewInfobox;
// fix typo "JP: Small-body" -> "JPL Small-body" and the like in ArticleText
string Typo_Pattern = @"JP[^L][ ]+(Small)[ ‐–—-]*(body)";
Match mTypo = Regex.Match(ArticleText, Typo_Pattern, RegexOptions.IgnoreCase);
if (mTypo.Success) ArticleText = Regex.Replace(ArticleText, Typo_Pattern, @"JPL $1-$2", RegexOptions.IgnoreCase);
// Search ArticleText for 1) ref name="jpldata", 2) "jpl", 3) "*jpl|sbdb|orbit*", then 4) ANY named ref with ssd.jpl.nasa.gov/sbdb.cgi.
// Put whichever wins first into "RefName_jpl".
string RefName_jpl_Pattern = @"\<ref name\s*=\s*(""?jpldata""?)\s*\>\s*(\{\{)?"; // #1
Match mRefName_jpl = Regex.Match(ArticleText, RefName_jpl_Pattern, RegexOptions.IgnoreCase);
if (!mRefName_jpl.Success)
{
RefName_jpl_Pattern = @"\<ref name\s*=\s*(""?jpl""?)\s*\>\s*(\{\{)?"; // #2
mRefName_jpl = Regex.Match(ArticleText, RefName_jpl_Pattern, RegexOptions.IgnoreCase);
if (!mRefName_jpl.Success)
{
RefName_jpl_Pattern = @"\<ref name\s*=\s*([^\>]*(?:" + JPLRefNameVariants + @")[^/\>]*?)\s*\>\s*(\{\{)?"; // #3
mRefName_jpl = Regex.Match(ArticleText, RefName_jpl_Pattern, RegexOptions.IgnoreCase);
if (!mRefName_jpl.Success)
{
RefName_jpl_Pattern = @"\<ref name\s*=\s*([^\>]*?)\s*(?<!/)\>(\s*\{\{)?[^\<]*ssd\.jpl\.nasa\.gov/sbdb\.cgi[^\<]+?\</ref\>"; // #4
mRefName_jpl = Regex.Match(ArticleText, RefName_jpl_Pattern, RegexOptions.IgnoreCase);
if (mRefName_jpl.Success) JPLRefNameVariants += @"|" + mRefName_jpl.Groups[1].Value.Trim(); // add new found name to JPLRefNameVariants
}
}
}
if ( mRefName_jpl.Success) RefName_jpl = mRefName_jpl.Groups[1].Value.Trim(); // one of them exists; assume it's good
if (!mRefName_jpl.Success) RefName_jpl = @"""jpldata"""; // none exist; build master named "jpldata"
JPLDataRef_Master = @"<ref name=" + RefName_jpl + @">{{Cite web" + // modeled after {{Cite SBDB}}
@" |url=" + JPL_URL + @";cad=1" + // cad=1 to show Close Approach Data, if available
@" |title=" + JPL_Designation +
@" |work=[[JPL Small-Body Database]]" +
@" |publisher=[[NASA]]/[[Jet Propulsion Laboratory]]" +
@" |access-date=" + DateTime.Today.ToString("d MMMM yyyy") +
@"}}</ref>"; // entire master ref is rendered on 1 line
string JPLDataRef_Slave = @"<ref name=" + RefName_jpl + @" />";
// Search ArticleText for ref name="*lowell*", "*astorb*", then "*AOED*". Put whichever wins first into "RefName_lowell".
string RefName_lowell_Pattern = @"\<ref name\s*=\s*([^\>]*lowell[^/\>]*?)\s*\>\s*(\{\{)?";
Match mRefName_lowell = Regex.Match(ArticleText, RefName_lowell_Pattern, RegexOptions.IgnoreCase);
if (!mRefName_lowell.Success)
{
RefName_lowell_Pattern = @"\<ref name\s*=\s*([^\>]*astorb[^/\>]*?)\s*\>\s*(\{\{)?";
mRefName_lowell = Regex.Match(ArticleText, RefName_lowell_Pattern, RegexOptions.IgnoreCase);
if (!mRefName_lowell.Success)
{
RefName_lowell_Pattern = @"\<ref name\s*=\s*([^\>]*AOED[^/\>]*?)\s*\>\s*(\{\{)?";
mRefName_lowell = Regex.Match(ArticleText, RefName_lowell_Pattern, RegexOptions.IgnoreCase);
}
}
if ( mRefName_lowell.Success) RefName_lowell = mRefName_lowell.Groups[1].Value; // one of them exists; assume it's good
if (!mRefName_lowell.Success) RefName_lowell = @"""lowell"""; // none exist; build master named "lowell"
string Lowell_URL = @"ftp://ftp.lowell.edu/pub/elgb/astorb.html";
LowellRef_Master = @"<ref name=" + RefName_lowell + @">{{Cite web" +
@" |url=" + Lowell_URL +
@" |website=astorb" +
@" |title=The Asteroid Orbital Elements Database" +
@" |publisher=[[Lowell Observatory]]" +
@"}}</ref>"; // entire master ref is rendered on 1 line
string LowellRef_Slave = @"<ref name=" + RefName_lowell + @" />";
// determine if JPL & Lowell master refs probably exist (ref name + "{{" are a good first sign)
if (!string.IsNullOrEmpty(mRefName_jpl.Groups[2].Value)) jplDataMasterExists = true;
if (!string.IsNullOrEmpty(mRefName_lowell.Groups[2].Value)) lowellMasterExists = true;
// check original JPL ref's access-date and skip if updated too recently (user-defined)
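// Illustration only (hypothetical settings, not the module's defaults; assumes SkipIfRecentlyUpdated_InYear is a plain string):
// with SkipIfRecentlyUpdated_InMonth = "January February" and SkipIfRecentlyUpdated_InYear = "2016",
// a ref containing "|access-date=12 February 2016" yields AccessDate_Month = "FEBRUARY" and AccessDate_Year = "2016".
// Both are found via IndexOf(), so the page is skipped as recently updated.
// "|access-date=2016-02-12" goes through the yyyy-mm-dd branch ("02" -> "FEBRUARY") and should give the same result.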
string JPLRef_Pattern = @"\<ref(?: name\s*=\s*" + RefName_jpl + @"\s*|[^\>]*)?\>" +
@"[^\<]*(Cite SBDB[^\<]*access\-?date\s*=|" +
@"access\-?date\s*=[^\<]*ssd\.jpl\.nasa\.gov/sbdb\.cgi|" +
@"ssd\.jpl\.nasa\.gov/sbdb\.cgi[^\<]*access\-?date\s*=)[^\<]*\</ref\>";
Match mJPLRef = Regex.Match(ArticleText, JPLRef_Pattern, RegexOptions.IgnoreCase);
if (mJPLRef.Success && skipIfRecentlyUpdated && !Skip)
{
string JPLRef = mJPLRef.Value;
SkipIfRecentlyUpdated_InMonth = SkipIfRecentlyUpdated_InMonth.ToUpper();
string AccessDate_Pattern = @"\|\s*access-?date\s*=\s*([^\|\}\<]*)";
Match mAccessDate = Regex.Match(JPLRef, AccessDate_Pattern, RegexOptions.IgnoreCase);
string AccessDate = mAccessDate.Groups[1].Value.Trim(); // don't ToUpper() so it displays as-written in edit summary
if (!string.IsNullOrEmpty(AccessDate))
{
int UpdatedRecently = -1;
string AccessDate_Month = "";
string AccessDate_Year = "";
Match mAccessDate_Year = Regex.Match(AccessDate, @"\d\d\d\d");
Match mAccessDate_MMM = Regex.Match(AccessDate, @"[a-z]{3,}", RegexOptions.IgnoreCase);
if (mAccessDate_MMM.Success && mAccessDate_Year.Success) // MMM or MMMM
{
AccessDate_Month = mAccessDate_MMM.Value.ToUpper();
AccessDate_Year = mAccessDate_Year.Value;
}
else if (!mAccessDate_MMM.Success && mAccessDate_Year.Success) // YMD, DMY, MDY
{
Match mYMD = Regex.Match(AccessDate, @"(\d\d\d\d)[ -]+(\d\d)[ -]+(\d\d)");
if (mYMD.Success) // if ####-##-##, assume yyyy-mm-dd
{
AccessDate_Year = mYMD.Groups[1].Value;
AccessDate_Month = mYMD.Groups[2].Value;
switch (AccessDate_Month)
{
case "1": case "01": AccessDate_Month="JANUARY"; break;
case "2": case "02": AccessDate_Month="FEBRUARY"; break;
case "3": case "03": AccessDate_Month="MARCH"; break;
case "4": case "04": AccessDate_Month="APRIL"; break;
case "5": case "05": AccessDate_Month="MAY"; break;
case "6": case "06": AccessDate_Month="JUNE"; break;
case "7": case "07": AccessDate_Month="JULY"; break;
case "8": case "08": AccessDate_Month="AUGUST"; break;
case "9": case "09": AccessDate_Month="SEPTEMBER"; break;
case "10": AccessDate_Month="OCTOBER"; break;
case "11": AccessDate_Month="NOVEMBER"; break;
case "12": AccessDate_Month="DECEMBER"; break;
default:
Summary += @"; JPL's accessdate (" + mYMD.Value + @") month not found (" + AccessDate_Month + @")";
Skip = true;
break;
}
}
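// numeric DMY/MDY access-dates are not parsed; AccessDate_Month stays empty in that case
}
if (SkipIfRecentlyUpdated_InMonth.Length >= 3)
{
// IndexOf("") returns 0, so an unparsed (empty) month or year must not count as a match
if (!string.IsNullOrEmpty(AccessDate_Month) && !string.IsNullOrEmpty(AccessDate_Year) &&
SkipIfRecentlyUpdated_InMonth.IndexOf(AccessDate_Month) > -1 &&
SkipIfRecentlyUpdated_InYear.IndexOf( AccessDate_Year) > -1)
UpdatedRecently = 1;
}
else if (string.IsNullOrEmpty(SkipIfRecentlyUpdated_InMonth) && !string.IsNullOrEmpty(AccessDate_Year))
UpdatedRecently = SkipIfRecentlyUpdated_InYear.IndexOf(AccessDate_Year);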
if (UpdatedRecently >= 0)
{
Summary = @"MP info updated recently (" + AccessDate + @")";
Skip = true;
}
}
}
// replace existing, bare-link, lowell-refs with lowell master and/or slaves
int FilledIn = 0;
bool loop = true;
bool lowellBareRefWasNamed = false; // set when a replaced bare ref carried a name= attribute
while (loop && !Skip)
{
// determine which to place
LowellRef_MasterOrSlave = (lowellMasterExists) ? LowellRef_Slave : LowellRef_Master;
string BareRef_Pattern = @"\<ref\s*(name\s*=\s*" + RefName_lowell + @"\s*)?\>\s*\[.+?ftp\.lowell\.edu.+?astorb.+?\]\</ref\>";
// The static Regex.Replace has no overload that limits the number of replacements, but the instance method does.
// Match and replace with the same options; otherwise a multi-line match is found but never replaced.
Regex rBareRef = new Regex(BareRef_Pattern, RegexOptions.Singleline);
Match mBareRef = rBareRef.Match(ArticleText);
if (mBareRef.Success)
{
if (!string.IsNullOrEmpty(mBareRef.Groups[1].Value)) lowellBareRefWasNamed = true;
ArticleText = rBareRef.Replace(ArticleText, LowellRef_MasterOrSlave, 1);
FilledIn++;
lowellMasterExists = true; // if not true before, it's definitely true now
LowellRef_MasterOrSlave = LowellRef_Slave; // force slave after first placement, in case master was placed
}
else loop = false;
if (FilledIn > 50) loop = false; // to prevent an infinite loop
if (!loop && FilledIn > 0)
{
string Suffix = (FilledIn > 1) ? "s" : "";
// mBareRef is normally the failed final match here, so rely on the flag recorded during replacement
if (lowellBareRefWasNamed) Summary += @"; filled in " + FilledIn + @" bare <ref name=" + RefName_lowell + @">";
else Summary += @"; filled in " + FilledIn + @" bare [ftp.lowell.edu] ref" + Suffix;
}
}
// replace existing, bare-link, jpl-refs with jpl master and/or slaves
FilledIn = 0;
loop = true;
bool jplBareRefWasNamed = false; // set when a replaced bare ref carried a name= attribute
while (loop && !Skip)
{
// determine which to place
JPLDataRef_MasterOrSlave = (jplDataMasterExists) ? JPLDataRef_Slave : JPLDataRef_Master;
string BareRef_Pattern = @"\<ref\s*(name\s*=\s*" + RefName_jpl + @"\s*)?\>\s*\[[^\<]+?ssd\.jpl\.nasa\.gov/sbdb\.cgi[^\<]+?\</ref\>";
Regex rBareRef = new Regex(BareRef_Pattern, RegexOptions.Singleline); // instance Replace is req'd for replacing 1 match at a time, otherwise end up with multiple masters
Match mBareRef = rBareRef.Match(ArticleText);
if (mBareRef.Success)
{
if (!string.IsNullOrEmpty(mBareRef.Groups[1].Value)) jplBareRefWasNamed = true;
ArticleText = rBareRef.Replace(ArticleText, JPLDataRef_MasterOrSlave, 1);
FilledIn++;
jplDataMasterExists = true; // if not true before, it's definitely true now
JPLDataRef_MasterOrSlave = JPLDataRef_Slave; // force slave after first placement, in case master was placed
}
else loop = false;
if (FilledIn > 50) loop = false; // to prevent an infinite loop
if (!loop && FilledIn > 0)
{
string Suffix = (FilledIn > 1) ? "s" : "";
// mBareRef is normally the failed final match here, so rely on the flag recorded during replacement
if (jplBareRefWasNamed) Summary += @"; filled in " + FilledIn + @" bare <ref name=" + RefName_jpl + @">";
else Summary += @"; filled in " + FilledIn + @" bare [ssd.jpl.nasa.gov] ref" + Suffix;
}
}
// ref-wrap any remaining bare links in |orbit_ref= (to do: expand this to all ArticleText?)
string OrbitRefBareLink_Pattern = @"(\|\s*orbit_ref\s*=\s*)(\[+ *(?:ftp|http|www)[^\]]+\]+)"; // "\s*" since malformed params not fixed yet
Match mOrbitRefBareLink = Regex.Match(ArticleText, OrbitRefBareLink_Pattern);
if (mOrbitRefBareLink.Success)
{
string RefWrapped = mOrbitRefBareLink.Groups[1].Value + @"<ref>" + mOrbitRefBareLink.Groups[2].Value + @"</ref>";
ArticleText = ArticleText.Replace(mOrbitRefBareLink.Value, RefWrapped);
}
// replace any unnamed refs with "ssd.jpl.nasa.gov/sbdb.cgi?sstr=" or "{{Cite SBDB" with JPL master or slave
int Named = 0;
string UnnamedJPLRef_Pattern = @"\<ref\>(\s*\{\{\s*Cite SBDB|[^\<]+?ssd\.jpl\.nasa\.gov/sbdb\.cgi\?sstr\=)[^\<]+?\</\s*ref\>";
foreach (Match m in Regex.Matches(ArticleText, UnnamedJPLRef_Pattern, RegexOptions.IgnoreCase))
{
Named++;
JPLDataRef_MasterOrSlave = (jplDataMasterExists) ? JPLDataRef_Slave : JPLDataRef_Master;
ArticleText = ArticleText.Replace(m.Value, JPLDataRef_MasterOrSlave);
jplDataMasterExists = true;
}
if (Named > 0)
{
string Plural = (Named > 1) ? "s" : "";
Summary += @"; master/slave'd " + Named + " [ssd.jpl.nasa.gov/sbdb] ref" + Plural;
}
// replace any unnamed refs with "ftp.lowell.edu...astorb" with Lowell master or slave
Named = 0;
string UnnamedLowellRef_Pattern = @"\<ref\>[^\<]+?ftp\.lowell\.edu[^\<]+?astorb[^\<]+?\</\s*ref\>";
foreach (Match m in Regex.Matches(ArticleText, UnnamedLowellRef_Pattern))
{
Named++;
LowellRef_MasterOrSlave = (lowellMasterExists) ? LowellRef_Slave : LowellRef_Master;
ArticleText = ArticleText.Replace(m.Value, LowellRef_MasterOrSlave);
lowellMasterExists = true;
}
if (Named > 0)
{
string Plural = (Named > 1) ? "s" : "";
Summary += @"; master/slave'd " + Named + " [ftp.lowell.edu] ref" + Plural;
}
// Preparatory infoboxes formatting fixes ///////////////////////////////////
// now we're ready to re-isolate {{Infobox planet}} for making many, many changes exclusive to Infobox text
Match mInfoboxText2 = Regex.Match(ArticleText, InfoboxText_Pattern, RegexOptions.IgnoreCase | RegexOptions.Singleline);
if (mInfoboxText2.Success && !Skip)
{
// make sure infobox isolation didn't grab too much
string ArticleTitle_Pattern = @"[\r\n]+\'\'\'(" + Regex.Escape(ArticleTitle) + @"|(\(\d+\) )?\{\{\s*mp\s*\|[^\}]+\}\})\'\'\'"; // typically designates the start of body-text; the title is escaped since it may contain regex metacharacters such as "(" or "."
Match mArticleTitle = Regex.Match(mInfoboxText2.Value, ArticleTitle_Pattern, RegexOptions.IgnoreCase);
if (mArticleTitle.Success) Summary = @"{{Infobox}} doesn't end with ""}}"" on a separate line."; // skip
NewInfobox = OldInfobox = mInfoboxText2.Value;
}
else if (!mInfoboxText2.Success && !Skip)
{
Summary = @"{{Infobox}} doesn't end with ""}}"" on a separate line (2nd try).";
Skip = true;
}
// move end-of-line-pipes "|" down to the next line while removing extra "\r\n"s b/w "|" & param names
string EoLPipe_Pattern = @"( *\|) *" + "[\r\n]+" + @"( *[a-z_]{4,})";
Match mEoLPipe = Regex.Match(NewInfobox, EoLPipe_Pattern);
if (mEoLPipe.Success && !Skip)
NewInfobox = Regex.Replace(NewInfobox, EoLPipe_Pattern, "\n" + @"$1$2"); // use per-match groups; the first match's groups would otherwise overwrite every later parameter name
// move any parameters on the "{{Infobox planet" line to the next line
string InlineParams_Pattern = @"(\{\{Infobox planet)( *\| *[^\r\n]+)";
Match mInlineParams = Regex.Match(NewInfobox, InlineParams_Pattern);
if (mInlineParams.Success && !Skip)
NewInfobox = Regex.Replace(NewInfobox, InlineParams_Pattern, @"$1" + "\n" + @"$2");
// split multiple in-line parameters from 1 line to multiple lines
string MultipleILPs_Pattern = @"((?<=[\r\n]) *\| *[a-z_]{4,} *= *[^\|\r\n\{]*?)(\[\[[^\]]+\]\])?(\{\{[^\{\}]+\}\})?([^\[\|\r\n\{]*?)(\[\[[^\]]+\]\])?( +\||\| +)(?![^\r\n]*\}\})"; // 6 grps
loop = true;
while (loop)
{
// Using "foreach (Match m in Regex.Matches(NewInfobox, MultipleILPs_Pattern))"
// with "(?<!\{\{[A-Za-z ]*?)" in the pattern will return only 1 match, even if there are more...
// Looping over matches in TempNewInfobox while operating exclusively on NewInfobox doesn't work either.
// Therefore, have to use a while-loop instead of a foreach. Anyone know why?!
// Makes me question the integrity of all my other foreach Regex.Matches......
Match mMultipleILPs = Regex.Match(NewInfobox, MultipleILPs_Pattern);
if (mMultipleILPs.Success) NewInfobox = Regex.Replace(NewInfobox, MultipleILPs_Pattern, @"$1$2$3$4$5" + "\n" + @"$6");
else loop = false;
}
// move text (usually trailing units) from between final <ref>/{{template}} and endofline to before final <ref>/{{templ}} (can't look past multiline {{cite}}s) (5 groups)
string TrailingText_Pattern = @"( *\| *(?!discovery|alt_names)[a-z_]+ *= *[^\r\n]*)( ?\<| ?\{\{(?!(?:e|·|Gr)[\|\}]))([^\r\n]+)(\}\}\</ref\>|(?<!br */?|sup)\>|\}\})([^\{\}\r\n]+?)" + "[\r\n]+";
Match mTrailingText = Regex.Match(NewInfobox, TrailingText_Pattern, RegexOptions.IgnoreCase);
if (mTrailingText.Success)
{
string TrailingText = mTrailingText.Groups[5].Value;
if (TrailingText.Length > 0)
{
// instead of incorporating this into the initial pattern, separating it here for clarity, anticipating add'l exceptions
int Exception_Count = Regex.Matches(TrailingText, @"\</?ref", RegexOptions.IgnoreCase).Count;
if (Exception_Count == 0) NewInfobox = Regex.Replace(NewInfobox, TrailingText_Pattern, @"$1$5$2$3$4" + "\n", RegexOptions.IgnoreCase);
else // try again, slightly stricter regex this time
{
TrailingText_Pattern = @"( *\| *(?!discovery|alt_names)[a-z_]+ *= *[^\r\n]*)( ?\<| ?\{\{(?!(?:e|·|Gr)[\|\}]))([^\r\n]+)(\}\}\</ref\>|(?<!br */?|sup)\>|\}\})([^\<\>\{\}\r\n]+?)" + "[\r\n]+";
mTrailingText = Regex.Match(NewInfobox, TrailingText_Pattern, RegexOptions.IgnoreCase);
if (mTrailingText.Success)
if (mTrailingText.Groups[5].Value.Length > 0)
NewInfobox = Regex.Replace(NewInfobox, TrailingText_Pattern, @"$1$5$2$3$4" + "\n", RegexOptions.IgnoreCase);
}
}
}
// Orbital Elements Table -> Infobox ////////////////////////////////////////
// declare Orbital Elements regionals
Dictionary<string, int> OEWPSigFigs = new Dictionary<string, int>();
Dictionary<string, int> OEJPLSigFigs = new Dictionary<string, int>();
Dictionary<string, decimal> OEValues = new Dictionary<string, decimal>();
Dictionary<string, string> OEValStr = new Dictionary<string, string>();
Dictionary<string, string> OEErrors = new Dictionary<string, string>();
Dictionary<string, string> OEPatterns = new Dictionary<string, string>();
Dictionary<string, string> OEUnits = new Dictionary<string, string>();
Dictionary<string, string> OEUnitsExpected = new Dictionary<string, string>();
Dictionary<string, string> OESymbols = new Dictionary<string, string>();
// isolate the "Orbital Elements" table & save components
string OrbitalElements_Pattern = @"Orbital\s+Elements\s+at\s+Epoch.+Orbit\s+Determination\s+Parameters";
Match mOrbitalElements = Regex.Match(JPL_ExternalText, OrbitalElements_Pattern, RegexOptions.Singleline);
if (mOrbitalElements.Success && !Skip)
{
string OrbitalElements = mOrbitalElements.Value;
bool isJ2000 = Regex.Match(OrbitalElements, "J2000").Success;
if (isJ2000)
{
OriginalSummary = Summary; // use summary to catch exceptions and trigger a skip condition for OE units only
// find & store Epoch & JD
string EpochJD, JD;
EpochJD = JD = "";
Match mJD = Regex.Match(OrbitalElements, @"Epoch ([\d\.]+)"); // JPL mixes up JD & Epoch
if (mJD.Success)
{
JD = mJD.Groups[1].Value;
Match mEpoch = Regex.Match(OrbitalElements, @"Epoch [\d\.]+ \(([^\)]+)\)");
if (mEpoch.Success)
{
// format Epoch from YMD to DMY
string EpochRaw = mEpoch.Groups[1].Value;
string EpochYear, EpochMonth, EpochDay;
EpochYear = EpochMonth = EpochDay = "";
string[] EpochRawSplit = Regex.Split(EpochRaw, "-");
foreach (string val in EpochRawSplit)
{
if (Regex.Match(val, @"\d\d\d\d").Success) EpochYear = val;
else if (Regex.Match(val, @"[A-Za-z]+").Success) EpochMonth = val;
else if (Regex.Match(val, @"[\d\.]+").Success) EpochDay = val;
}
// 01.0 -> 1 (I have yet to see any non-integer values used for EpochDay)
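EpochDay = EpochDay.TrimStart('0'); // will amend if/as necessary
if (EpochDay.Contains(".")) EpochDay = EpochDay.TrimEnd('0').TrimEnd('.'); // only strip trailing zeros after a decimal point, so a bare "10" or "20" stays intact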
switch (EpochMonth)
{
case "Jan": EpochMonth="January"; break;
case "Feb": EpochMonth="February"; break;
case "Mar": EpochMonth="March"; break;
case "Apr": EpochMonth="April"; break;
case "May": EpochMonth="May"; break;
case "June": // exists
case "Jun": EpochMonth="June"; break;
case "July": // exists
case "Jul": EpochMonth="July"; break;
case "Aug": EpochMonth="August"; break;
case "Sept": // exists?
case "Sep": EpochMonth="September"; break;
case "Oct": EpochMonth="October"; break;
case "Nov": EpochMonth="November"; break;
case "Dec": EpochMonth="December"; break;
default:
Summary = @"Month: '" + EpochMonth + "' not found!";
Skip = true;
break;
}
string EpochDMY = EpochDay + " " + EpochMonth + " " + EpochYear;
EpochJD = EpochDMY + @" ([[Julian day|JD]] " + JD + ")";
OEValStr.Add("epoch", EpochJD);
}
}
// prepare to find & store most other OE params
List<string> OEParameters = new List<string>(new string[]
{ "eccentricity", "aphelion", "perihelion", "semimajor", "inclination", "asc_node", "arg_peri", "mean_anomaly", "mean_motion" });
foreach (string param in OEParameters)
{
switch (param)
{
case "eccentricity": OEPatterns.Add(param, @"\>e\<.+?size\=.2.\>([\d\.Ee\+-]+)\<.+?size\=.2.\>([\d\.n/aEe\+-]+)()\<"); break;
case "aphelion": OEPatterns.Add(param, @"\>Q\<.+?size\=.2.\>([\d\.Ee\+-]+)\<.+?size\=.2.\>([\d\.n/aEe\+-]+)\<.+?size\=.2.\>([a-z/]+)\<"); break;
case "perihelion": OEPatterns.Add(param, @"\>q\<.+?size\=.2.\>([\d\.Ee\+-]+)\<.+?size\=.2.\>([\d\.n/aEe\+-]+)\<.+?size\=.2.\>([a-z/]+)\<"); break;
case "semimajor": OEPatterns.Add(param, @"\>a\<.+?size\=.2.\>([\d\.Ee\+-]+)\<.+?size\=.2.\>([\d\.n/aEe\+-]+)\<.+?size\=.2.\>([a-z/]+)\<"); break;
case "inclination": OEPatterns.Add(param, @"\>i\<.+?size\=.2.\>([\d\.Ee\+-]+)\<.+?size\=.2.\>([\d\.n/aEe\+-]+)\<.+?size\=.2.\>([a-z/]+)\<"); break;
case "asc_node": OEPatterns.Add(param, @"\>node\<.+?size\=.2.\>([\d\.Ee\+-]+)\<.+?size\=.2.\>([\d\.n/aEe\+-]+)\<.+?size\=.2.\>([a-z/]+)\<"); break;
case "arg_peri": OEPatterns.Add(param, @"\>peri\<.+?size\=.2.\>([\d\.Ee\+-]+)\<.+?size\=.2.\>([\d\.n/aEe\+-]+)\<.+?size\=.2.\>([a-z/]+)\<"); break;
case "mean_anomaly": OEPatterns.Add(param, @"\>M\<.+?size\=.2.\>([\d\.Ee\+-]+)\<.+?size\=.2.\>([\d\.n/aEe\+-]+)\<.+?size\=.2.\>([a-z/]+)\<"); break;
case "mean_motion": OEPatterns.Add(param, @"\>n\<.+?size\=.2.\>([\d\.Ee\+-]+)\<.+?size\=.2.\>([\d\.n/aEe\+-]+)\<.+?size\=.2.\>([a-z/]+)\<"); break;
}
switch (param) // separated for readability & editability
{
case "eccentricity": OEUnitsExpected.Add(param, ""); OESymbols.Add(param, @"\(e\)"); break;
case "aphelion": OEUnitsExpected.Add(param, "au"); OESymbols.Add(param, @"\(Q\)"); break;
case "perihelion": OEUnitsExpected.Add(param, "au"); OESymbols.Add(param, @"\(q\)"); break;
case "semimajor": OEUnitsExpected.Add(param, "au"); OESymbols.Add(param, @"\(a\)"); break;
case "inclination": OEUnitsExpected.Add(param, "deg"); OESymbols.Add(param, @"\(i\)"); break;
case "asc_node": OEUnitsExpected.Add(param, "deg"); OESymbols.Add(param, @"\(Ω\)"); break;
case "arg_peri": OEUnitsExpected.Add(param, "deg"); OESymbols.Add(param, @"\(ω\)"); break;
case "mean_anomaly": OEUnitsExpected.Add(param, "deg"); OESymbols.Add(param, @"\(M\)"); break;
case "mean_motion": OEUnitsExpected.Add(param, "deg/d"); OESymbols.Add(param, @"\(n\)"); break;
}
}
// find & store most other OE params
for (int i = 0; i < OEParameters.Count; i++)
{
Param = OEParameters[i];
string Pattern = OEPatterns[Param];
Match mPattern = Regex.Match(OrbitalElements, Pattern, RegexOptions.Singleline); // must remain case-sensitive!
if (mPattern.Success)
{
OEValStr.Add(Param, mPattern.Groups[1].Value);
OEErrors.Add(Param, mPattern.Groups[2].Value);
OEUnits.Add( Param, mPattern.Groups[3].Value);
if (OEUnits[ Param] != OEUnitsExpected[Param])
{
Summary += @" """ + OEUnitsExpected[Param] + @""" units expected for " + Param + @"; """ + OEUnits[Param] + @""" units received.";
OEParameters.Remove(Param);
i--;
}
}
else
{
OEParameters.Remove(Param);
i--;
}
}
// find & store "special" OE params
string Period_Pattern = @"\>period\<.+?size\=.2.\>([\d\.]+)\<br\>([\d\.]+)\<.+?size\=.2.\>([\d\.n/aEe\+-]+)\<br\>([\d\.n/aEe\+-]+)\<.+?size\=.2.\>([a-z]+)\<br\>([a-z]+)\<";
Match mPeriod = Regex.Match(OrbitalElements, Period_Pattern, RegexOptions.Singleline);
if (mPeriod.Success)
{
OEParameters.Add("PeriodYears"); // years must be added before days
OEParameters.Add("PeriodDays"); // days must be added after years
OEValStr.Add("PeriodDays", mPeriod.Groups[1].Value);
OEValStr.Add("PeriodYears", mPeriod.Groups[2].Value);
OEErrors.Add("PeriodDays", mPeriod.Groups[3].Value);
OEErrors.Add("PeriodYears", mPeriod.Groups[4].Value);
OEUnits.Add( "PeriodDays", mPeriod.Groups[5].Value); // should be "d"
OEUnits.Add( "PeriodYears", mPeriod.Groups[6].Value); // should be "yr"
if (OEUnits[ "PeriodDays"] != "d") Summary += @" Day units expected for period; """ + OEUnits["PeriodDays"] + @""" units received.";
if (OEUnits[ "PeriodYears"] != "yr") Summary += @" Yr units expected for period; """ + OEUnits["PeriodYears"] + @""" units received.";
if (OEUnits[ "PeriodDays"] != "d" | OEUnits[ "PeriodYears"] != "yr")
{
OEParameters.Remove("PeriodDays");
OEParameters.Remove("PeriodYears");
}
}
// summary skip-condition if any OE units don't match
if (Summary != OriginalSummary)
{
if (skipEntirePageIfAnyJPLUnitsDontMatchExpected) Skip = true;
else Summary = OriginalSummary;
}
// Find & store # of sig figs used in infobox for existing parameters. Round up to 5 (similar to JPL's error value precision).
int j = 0;
int SumSigFigs = 0;
double AvgSigFigs = 0;
foreach (string param in OEParameters)
{
j++;
int SigFigs = 0;
string Figs = "";
string FindSigFigs_Pattern = @" *\|\s*" + param + @"\s*=\s*(?:\{\{\s*(?:val|convert)\s*\|\s*)?(\d[\d\.,]*|\.\d+)";
Match mFindSigFigs = Regex.Match(NewInfobox, FindSigFigs_Pattern, RegexOptions.IgnoreCase);
if (mFindSigFigs.Success && param != "PeriodDays" && param != "PeriodYears")
Figs = mFindSigFigs.Groups[1].Value;
// handle period params separately; have to improvise and use JPL's values since:
// 1) there's no "period_days" parameter, just "period", and
// 2) someone might've put D before Y or vice versa, so can't trust the 1st number in the |period= parameter
if (param == "PeriodYears") Figs = OEValStr["PeriodYears"];
if (param == "PeriodDays")
{
// PeriodDays sigfigs = WhichEverIsGreater(JPL # of integer figs, PeriodYears sigfigs + 2)
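// Illustration only (hypothetical JPL values): PeriodDays = "1682.51" has 4 integer figures, so AdditionalFigs = 2.
// If PeriodYears = "4.61" was stored above with 3 sigfigs, then PeriodYearsFigs = 3 + 2 = 5 and
// PeriodDaysFigs = max(4, 5) = 5, so PeriodDays would later be rounded to 5 sigfigs (1682.5).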
Figs = Regex.Replace(OEValStr["PeriodDays"], @"(\d+)\.\d*", @"$1");
int IntegerFigs = Figs.Length;
int AdditionalFigs = (IntegerFigs < 5) ? 2 : 1; // don't want to add too many
int PeriodYearsFigs = OEWPSigFigs["PeriodYears"] + AdditionalFigs; // "PeriodYears" must be evaluated before "PeriodDays"
int PeriodDaysFigs = (IntegerFigs < PeriodYearsFigs) ? PeriodYearsFigs : IntegerFigs;
Figs = new String('9', PeriodDaysFigs);
}
Figs = Figs.Replace(".", "").Replace(",", "").Trim().TrimStart('0');
if (param == "PeriodYears" | param == "PeriodDays")
{
j--; // PeriodYears is an outlier when it comes to precision; typically only 3 sigfigs,
SigFigs = Figs.Length; // and PeriodDays is wildly over-precise, so ignore both JPL and WP.
}
else if (param != "PeriodYears" && param != "PeriodDays")
{
SigFigs = (Figs.Length < 5) ? 5 : Figs.Length; // assume this is good enough (0th iteration)
// If a parameter value doesn't exist, like semimajor for 145534 Jhongda, and all other existing values are >5 sigfigs long,
// then it looks better and seems more appropriate that semimajor have >5 sigfigs instead of 5, if possible.
SumSigFigs = SumSigFigs + SigFigs;
AvgSigFigs = Convert.ToDouble(SumSigFigs) / Convert.ToDouble(j);
AvgSigFigs = Math.Round(AvgSigFigs); // use this rounded value as the final
if (SigFigs < AvgSigFigs)
{
SumSigFigs = SumSigFigs - SigFigs + Convert.ToInt32(AvgSigFigs); // remove the assumed value, add the final
SigFigs = Convert.ToInt32(AvgSigFigs);
}
}
OEWPSigFigs.Add(param, SigFigs);
bool debug1 = false;
if (debug1)
{
Summary += param + ": " + j + "," + Figs + "," + Figs.Length + "," + SumSigFigs.ToString() + "," + AvgSigFigs.ToString() + "," + SigFigs.ToString() + "| ";
Skip = true;
}
}
// convert OEValStr from strings to decimals & round to 5 or more significant figures
foreach (string param in OEParameters)
{
// find & store JPL's sigfigs, for comparison later
string ValStr = OEValStr[param];
ValStr = ValStr.Replace(".","").Replace(",","").Trim().TrimStart('0');
OEJPLSigFigs.Add(param, ValStr.Length);
// round roundable params (eccentricity, aphelion, semimajor, perihelion, inclination, asc_node, arg_peri, mean_anomaly, mean_motion, PeriodDays, PeriodYears)
// double doesn't treat trailing decimal 0's as sigfigs (342843 Davidbowie's i should be 2.7680), but decimal DOES
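// Illustration only (hypothetical value, 5 sigfigs requested): "2.76804" -> Scale = 10^(floor(log10(2.76804)) + 1) = 10,
// 2.76804 / 10 = 0.276804, rounded to 5 decimal places = 0.27680, times 10 = 2.7680 (any surplus trailing zeros are trimmed below).
// Dividing by Scale first is what turns Math.Round's "decimal places" into "significant figures".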
double OEValue = Convert.ToDouble(OEValStr[param]);
decimal Scale = Convert.ToDecimal(Math.Pow(10, Math.Floor(Math.Log10(Math.Abs(OEValue))) + 1));
decimal OERoundedValue = Scale * Math.Round(Convert.ToDecimal(OEValue) / Scale, OEWPSigFigs[param]); // decimal=6 sigfigs, double=5 sigfigs here, for some reason
OEValues.Add(param, OERoundedValue);
// Remove the UNnecessary trailing zeros (some might be significant).
// Excessive trailing zeros affect some values, but not all. Not sure why...and I wish there was a cleaner way to fix this...
string val = OEValues[param].ToString();
string valnp = val.Replace(".", "").Trim();
if (valnp.Length > OEWPSigFigs[param])
{
int Excess = valnp.Length - OEWPSigFigs[param];
string Excess0_Pattern = @"0{" + Excess + "}$";
val = Regex.Replace(val, Excess0_Pattern, "");
OEValStr[param] = val.Trim('.');
}
// Pad with significant trailing zeros IFF value is currently below the sigfig threshold
// AND JPL contains at least that threshold number of sigfigs.
valnp = val.Replace(".", "").TrimStart('0');
if (valnp.Length < OEWPSigFigs[param] && OEWPSigFigs[param] <= OEJPLSigFigs[param])
{
// This happens if there are trailing significant 0's in the JPL value, which then get removed even if below the 5-digit cutoff,
// probably b/c I *HAVE TO* make "OEValue" (above) a double, b/c some Math.functions can't accept decimals. It's the 21st century, C#, wtf.
int Deficit = OEWPSigFigs[param] - valnp.Length;
string Deficit0s = new String('0', Deficit);
val += Deficit0s;
OEValStr[param] = val.Trim('.');
}
}
// determine if OE symbols are being used in existing params (& remove backslashes)
foreach (string param in OEParameters)
{
if (param.IndexOf("Period") != -1) continue; // no symbols for PeriodDays & PeriodYears
string Symbol_Pattern = @"\|\s*" + param + @"\s*=[^\r\n]*" + OESymbols[param];
Match mSymbolExists = Regex.Match(NewInfobox, Symbol_Pattern); // need escapes in symbols here
OESymbols[param] = OESymbols[param].Replace(@"\", ""); // escapes no longer needed
if (mSymbolExists.Success) areSymbolsUsed = true;
}
// determine the prevailing infobox spacing scheme ////////////////////
// 4 spacing values: <S1>|<S2><param><S3>=<S4><paramvalue>
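// Illustration only (hypothetical line): " | aphelion   = 3.1234" has S1 = 1 space, S2 = 1 space, S3 = 3 spaces, S4 = 1 space.
// S1 and S2 are each stored as length + 1 and concatenated as digits: "2" + "2" -> 22 in S12List (decoded later by subtracting 1).
// Because S3 is longer than 2 spaces, EqualAlignTo = "aphelion".Length + 3 = 11.
// Note that this single-digit encoding assumes fewer than 9 spaces in each of S1 and S2.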
List<int> S12List = new List<int>();
List<int> S3List = new List<int>();
List<int> S4List = new List<int>();
List<int> EqualAlignToList = new List<int>(); // stores the relative pos of "=" for each param found
// name/mp_name/width/background are/should be excluded from search b/c they're more frequently malformed than other params
List<string> SearchList = AlmostAllParamList; // choose either UpdatableParamList (~27) or AlmostAllParamList (~107)
foreach (string param in SearchList)
{
string Pattern = @"(?<!efn)( *)\|( *)" + param + @"( *)\=( *)([^\r\n\|]{4,})?"; // want non-empty 1-line params
Match mPattern = Regex.Match(NewInfobox, Pattern);
if (mPattern.Success)
{
S1 = mPattern.Groups[1].Value + " "; // >= 1 space needed so there are no leading zeros, which might be truncated.
S2 = mPattern.Groups[2].Value + " "; // Will subtract 1 from these values later.
string S12_temp = S1.Length.ToString() + S2.Length.ToString();
S12List.Add(Convert.ToInt32(S12_temp));
// handle S3 & EqualAlignTo separately since they're related & vary differently than S12
S3 = mPattern.Groups[3].Value;
if (S3.Length > 2 | param.Length > 10) EqualAlignTo = param.Length + S3.Length;
else EqualAlignTo = 0; // pos of "=" relative to start of parameter name
S3List.Add(S3.Length);
EqualAlignToList.Add(EqualAlignTo);
// handle S4 separately, only for non-empty params, since empty params only count towards S123
S4 = mPattern.Groups[4].Value;
if (mPattern.Groups[5].Value.Trim().Length >= 4 | S4.Length > 0)
{
if (S4.Length == 2) S4 = " "; // "=" followed by 2 spaces is weird, and probably a typo too
S4List.Add(S4.Length);
}
else S4 = "";
//NewInfobox += " " + param + ":" + S12_temp + S3.Length.ToString() + S4.Length.ToString() + @"+" + EqualAlignTo.ToString(); // debugging
}
}
// figure out which values are the most common
int MostCommonS12 = (from i in S12List // stackoverflow.com ftw!
group i by i into grp
orderby grp.Count() descending
select grp.Key).First();
int MostCommonS3 = (from i in S3List
group i by i into grp
orderby grp.Count() descending
select grp.Key).First();
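int MostCommonS4 = (from i in S4List
group i by i into grp
orderby grp.Count() descending
select grp.Key).FirstOrDefault(); // S4List is empty if no matched param has a value or a space after "="; default to 0 spaces instead of throwing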
EqualAlignTo = (from i in EqualAlignToList
group i by i into grp
orderby grp.Count() descending
select grp.Key).First();
// rule out any alignment IFF EqualAlignTo frequency < 40% AND MostCommonS3 < 3
int EqualAlignTo_Count = ((from temp in EqualAlignToList where temp.Equals(EqualAlignTo) select temp).Count());
double EqualAlignTo_Fraction = Convert.ToDouble(EqualAlignTo_Count) / Convert.ToDouble(EqualAlignToList.Count);
if (EqualAlignTo_Fraction < 0.4 && MostCommonS3 < 3) EqualAlignTo = 0;
// store remaining spacing scheme
string SpacingIndex = MostCommonS12.ToString();
int MostCommonS1 = Convert.ToInt32(SpacingIndex.Substring(0, 1)) - 1;
int MostCommonS2 = Convert.ToInt32(SpacingIndex.Substring(1, 1)) - 1;
S1 = new String(' ', MostCommonS1);
S2 = new String(' ', MostCommonS2);
S4 = new String(' ', MostCommonS4);
if (MostCommonS3 == 2) S3 = " "; // 2 spaces is most likely a bug/typo for 1 space, if ubiquitous
else S3 = new String(' ', MostCommonS3);
S12 = S1 + "|" + S2;
bool debug2 = false;
if (debug2)
{
NewInfobox += "\n Most common S1, 2, 3, 4 lengths:" + MostCommonS1.ToString() + ", " +
MostCommonS2.ToString() + ", " + MostCommonS3.ToString() + ", " + MostCommonS4.ToString();
NewInfobox += "\n EqualAlignTo length:" + EqualAlignTo.ToString();
NewInfobox += "\n EqualAlignTo_Count: " + EqualAlignTo_Count.ToString();
NewInfobox += "\n EqualAlignTo_Fraction:" + EqualAlignTo_Fraction.ToString();
}
// add/update {{Infobox}} OE params ///////////////////////////////////
Param = "discovered";
if (EqualAlignTo > 0) S3 = (EqualAlignTo >= Param.Length) ? new String(' ', EqualAlignTo - Param.Length) : "";
string OldParam = @" *\|\s*" + Param + @"\s*=[^\r\n\<]*";
string NewUnits = "";
string NewParam = S12 + Param + S3 + "=" + S4 + JPL_DDate;
Match mParam = Regex.Match(NewInfobox, OldParam);
if ( mParam.Success && !string.IsNullOrEmpty(JPL_DDate)) NewInfobox = Regex.Replace(NewInfobox, OldParam, NewParam);
if (!mParam.Success && !string.IsNullOrEmpty(JPL_DDate)) NewInfobox = rAppendToInfobox.Replace(NewInfobox, @"$1" + NewParam + "\n" + @"$2");
Param = "minorplanet"; // insert @ top
if (EqualAlignTo > 0) S3 = (EqualAlignTo >= Param.Length) ? new String(' ', EqualAlignTo - Param.Length) : "";
OldParam = @" *\|\s*" + Param + @"\s*=[^\r\n\<]*";
NewParam = S12 + Param + S3 + "=" + S4 + "yes";
string AppendParam = @"{{Infobox planet" + "\n" + NewParam;
mParam = Regex.Match(NewInfobox, OldParam);
if ( mParam.Success) NewInfobox = Regex.Replace(NewInfobox, OldParam, NewParam);
if (!mParam.Success) NewInfobox = Regex.Replace(NewInfobox, @"{{Infobox planet", AppendParam);
// remove deprecated OE parameters, unless they end with "<!--" (e.g. [[6398 Timhunter]])
List<string> DeprecatedParams = new List<string>(new string[]
{ "physical_characteristics", "orbital_characteristics", "designations", "discovery", "width" , "diameter" });
int Deleted = 0;
foreach (string param in DeprecatedParams)
{
string DeleteCandidate_Pattern = @"[\r\n]+ *\|\s*" + param + @"\s*=[^\r\n]*";
Match mDeleteCandidate = Regex.Match(NewInfobox, DeleteCandidate_Pattern, RegexOptions.IgnoreCase);
if (mDeleteCandidate.Success)
{
string CommentElement_Pattern = @"(\<\!\-\-|\-\-\>)";
Match mCommentElement = Regex.Match(mDeleteCandidate.Value, CommentElement_Pattern);
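string MultilineComment_Pattern = @"\<\!\-\-[ ]*$"; // the candidate text never contains a newline ([^\r\n]*), so look for a comment left open at the end of the line instead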
Match mMultilineComment = Regex.Match(mDeleteCandidate.Value, MultilineComment_Pattern);
string CommentSelfContained_Pattern = @"(\<\!\-\-[^\r\n]*\-\-\>)";
Match mCommentSelfContained = Regex.Match(mDeleteCandidate.Value, CommentSelfContained_Pattern);
bool okDelete1 = !mCommentElement.Success;
bool okDelete2 = (mCommentElement.Success && mCommentSelfContained.Success && !mMultilineComment.Success);
if (okDelete1 || okDelete2)
{
Deleted++;
NewInfobox = Regex.Replace(NewInfobox, DeleteCandidate_Pattern, "", RegexOptions.IgnoreCase);
}
}
}
if (Deleted > 0)
{
string Plural = (Deleted > 1) ? "s" : "";
Summary += @"; remove " + Deleted.ToString() + @" deprecated parameter" + Plural;
}
// bgcolour (deprecated) -> background
Param = @"bgcolu?ou?r";
OldParam = @" *\|\s*" + Param + @"\s*= *([^\r\n\<]*)";
Param = "background";
if (EqualAlignTo > 0) S3 = (EqualAlignTo >= Param.Length) ? new String(' ', EqualAlignTo - Param.Length) : "";
NewParam = S12 + Param + S3 + "=" + S4 + @"${1}"; // don't modify bg color (see [[User:Rfassbind/sandbox/color-scheme]])
mParam = Regex.Match(NewInfobox, OldParam);
if (mParam.Success) NewInfobox = Regex.Replace(NewInfobox, OldParam, NewParam);
// |background= handling
Param = "background";
if (EqualAlignTo > 0) S3 = (EqualAlignTo >= Param.Length) ? new String(' ', EqualAlignTo - Param.Length) : "";
OldParam = @" *\|\s*" + Param + @"\s*=[^\r\n\<]*";
NewParam = S12 + Param + S3 + "=" + S4 + @"#FFFFC0";
mParam = Regex.Match(NewInfobox, OldParam);
if (mParam.Success)
{
// if background != "#<3-6-digit hexadecimal>", make background=#FFFFC0
string OldParamColor_Pattern = @"(#[0-9A-Fa-f]{3,6})(?![0-9A-Fa-f])";
Match mOldParamColor = Regex.Match(mParam.Value, OldParamColor_Pattern);
if (!mOldParamColor.Success) NewInfobox = Regex.Replace(NewInfobox, OldParam, NewParam);
// otherwise do nothing: don't modify bg color, see [[User:Rfassbind/sandbox/color-scheme]] instead
}
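// Examples of how the check above behaves (values chosen for illustration):
//   "|background=#FFFFC0" and "|background=#FFC" contain a 3-6-digit hex color and are left untouched;
//   "|background=yellow" and "|background=#FFFFC0A" (7 hex digits) do not match and are reset to #FFFFC0.
// Note that the pattern also accepts 4- and 5-digit runs such as "#FFFFC".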
if (!mParam.Success)
{
// insert "background=#FFFFC0" after |minorplanet=yes if background DNE
NewInfobox = Regex.Replace(NewInfobox, @" *\|\s*minorplanet\s*=[^\r\n\<]*", @"$0" + "\n" + NewParam);
}
Param = "epoch";
if (EqualAlignTo > 0) S3 = (EqualAlignTo >= Param.Length) ? new String(' ', EqualAlignTo - Param.Length) : "";
OldParam = @" *\|\s*" + Param + @"\s*=[^\r\n\<]*";
NewParam = S12 + Param + S3 + "=" + S4 + OEValStr[Param];
mParam = Regex.Match(NewInfobox, OldParam);
if ( mParam.Success) NewInfobox = Regex.Replace(NewInfobox, OldParam, NewParam);
if (!mParam.Success) NewInfobox = rAppendToInfobox.Replace(NewInfobox, @"$1" + NewParam + "\n" + @"$2");
foreach (string param in OEParameters)
{
// evaluate error (display if >= 1)
double Error = -1d;
Match mIsNum = Regex.Match(OEErrors[param].Trim(), @"^[\d\.Ee\+\-,]+$");
if (mIsNum.Success) Error = Convert.ToDouble(OEErrors[param].Trim());
// evaluate suffix
double AU = 0d; // convert to Gm or Tm?
if (param == "aphelion" || param == "semimajor" || param == "perihelion") AU = Convert.ToDouble(OEValStr[param].Trim());
string ConvertSuffix = (AU < 6.68459d) ? @"Gm|abbr=on" : @"Tm|abbr=on";
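// Where the 6.68459 threshold comes from: 1 AU = 149.5978707 Gm, so 1 Tm = 1000 Gm = 1000 / 149.5978707 AU ≈ 6.68459 AU.
// For example, 2.77 AU ≈ 414 Gm is displayed in Gm, while 30.1 AU ≈ 4.50 Tm is displayed in Tm (both values are illustrative).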
if (param == "mean_motion") // (n) handle separately from other OEParams
{
// evaluate mm value
double MM = Convert.ToDouble(OEValStr[param].Trim());
if (MM < 1d) OEValStr[param] = @"{{Deg2DMS|" + OEValStr[param] + @"|sup=ms}}";
// evaluate mm error (display if >= MM/2)
if (Error >= (MM/2d))
{
if (Error < 1d) OEValStr[param] += @" ± {{Deg2DMS|" + OEErrors[param] + @"|sup=ms}}";
if (Error >= 1d) OEValStr[param] += @" ± " + OEErrors[param];
}
// evaluate mm units
if (MM < 1d || (Error >= (MM/2d) && Error < 1d)) OEUnits[param] = @" / day";
else OEUnits[param] = @"°/day";
}
// reuse OEUnits & OEValStr lists to customize each value displayed
switch (param)
{
// {{Convert|1.23456|AU|Gm/Tm}} params
case "aphelion": // (Q)
if (Error >= 1d) OEValStr[param] = @"{{Convert|" + OEValStr[param] + "|±|" + OEErrors[param] + "|" + OEUnits[param].ToUpper() + "|" + ConvertSuffix + @"|lk=on}}";
if (Error < 1d) OEValStr[param] = @"{{Convert|" + OEValStr[param] + "|" + OEUnits[param].ToUpper() + "|" + ConvertSuffix + @"|lk=on}}";
OEUnits[ param] = @""; break;
case "semimajor": // (a)
if (Error >= 1d) OEValStr[param] = @"{{Convert|" + OEValStr[param] + "|±|" + OEErrors[param] + "|" + OEUnits[param].ToUpper() + "|" + ConvertSuffix + @"}}";
if (Error < 1d) OEValStr[param] = @"{{Convert|" + OEValStr[param] + "|" + OEUnits[param].ToUpper() + "|" + ConvertSuffix + @"}}";
OEUnits[ param] = @""; break;
case "perihelion": // (q)
if (Error >= 1d) OEValStr[param] = @"{{Convert|" + OEValStr[param] + "|±|" + OEErrors[param] + "|" + OEUnits[param].ToUpper() + "|" + ConvertSuffix + @"}}";
if (Error < 1d) OEValStr[param] = @"{{Convert|" + OEValStr[param] + "|" + OEUnits[param].ToUpper() + "|" + ConvertSuffix + @"}}";
OEUnits[ param] = @""; break;
case "PeriodDays":
case "PeriodYears": continue; // skip these & handle them after the foreach loop
}
// {{val}} params & (conditional) units based on large errors (separated for readability & editability) (e.g. 2014 MT69)
switch (param)
{
case "eccentricity": // (e)
if (Error >= 1d) OEValStr[param] = @"{{val|" + OEValStr[param] + "|" + OEErrors[param] + @"}}";
OEUnits[param] = @""; break;
case "inclination": // (i)
case "asc_node": // (node)
case "arg_peri": // (peri)
if (Error >= 1d)
{
OEValStr[param] = @"{{val|" + OEValStr[param] + "|" + OEErrors[param] + @"|u=°}}";
OEUnits[param] = @"";
}
else OEUnits[param] = @"°";
break;
case "mean_anomaly": // (M)
if (Error >= Convert.ToDouble(OEValStr[param])) OEValStr[param] = @"{{val|" + OEValStr[param] + "|" + OEErrors[param] + @"}}";
OEUnits[param] = @"[[Degree (angle)|°]]"; break; // unit's first use in the rendered infobox
}
// add/append OE parameters & values
if (EqualAlignTo > 0) S3 = (EqualAlignTo >= param.Length) ? new String(' ', EqualAlignTo - param.Length) : "";
OldParam = @" *\|\s*" + param + @"\s*=[^\r\n\<]*";
string NewSymbol = (areSymbolsUsed) ? " " + OESymbols[param] : "";
NewParam = S12 + param + S3 + "=" + S4 + OEValStr[param] + OEUnits[param] + NewSymbol;
mParam = Regex.Match(NewInfobox, OldParam);
if ( mParam.Success) NewInfobox = Regex.Replace(NewInfobox, OldParam, NewParam);
if (!mParam.Success) NewInfobox = rAppendToInfobox.Replace(NewInfobox, @"$1" + NewParam + "\n" + @"$2");
} // end foreach (string param in OEParameters)
Param = "period";
if (OEValStr.ContainsKey("PeriodYears"))
{
if (EqualAlignTo > 0) S3 = (EqualAlignTo >= Param.Length) ? new String(' ', EqualAlignTo - Param.Length) : "";
OldParam = @" *\|\s*" + Param + @"\s*=[^\r\n\<]*";
string NewUnits_yr = @" [[Julian year (astronomy)|yr]]";
string NewUnits_d = @" [[Julian year (astronomy)|d]]"; // best description of a Julian day, as used here, is at the top of [[Julian year]]
// build NewParam & handle large errors (don't include days if error >= period/2)
double PYears = Convert.ToDouble(OEValStr["PeriodYears"]);
double PYearsErr = Convert.ToDouble(OEErrors["PeriodYears"]);
if (PYearsErr >= (PYears/2d))
NewParam = S12 + Param + S3 + "=" + S4 + "{{val|" + PYears.ToString() + "|" + PYearsErr.ToString() + @"}}" + NewUnits_yr;
else NewParam = S12 + Param + S3 + "=" + S4 + OEValStr["PeriodYears"] + NewUnits_yr + " (" + OEValStr["PeriodDays"] + NewUnits_d + ")";
// add/update period
mParam = Regex.Match(NewInfobox, OldParam);
if ( mParam.Success) NewInfobox = Regex.Replace(NewInfobox, OldParam, NewParam);
if (!mParam.Success) NewInfobox = rAppendToInfobox.Replace(NewInfobox, @"$1" + NewParam + "\n" + @"$2");
}
// figure out which ref to use in |orbit_ref=, if/when needed
JPLDataRef_MasterOrSlave = (jplDataMasterExists) ? JPLDataRef_Slave : JPLDataRef_Master;
// make sure that "|orbit_ref=" contains a ref with "jpl" in the name
Param = "orbit_ref";
if (EqualAlignTo > 0) S3 = (EqualAlignTo >= Param.Length) ? new String(' ', EqualAlignTo - Param.Length) : "";
string OrbitRef_Pattern = @" *\|\s*orbit_ref\s*= *([^\r\n]*(?:\{\{cit[ae][ t][^\}]+\}\}(?:\s*\</ref\>)?)?)"; // "?" in case not on a sep line & Environment.NewLine doesn't work here for some reason
Match mOrbitRef = Regex.Match(NewInfobox, OrbitRef_Pattern, RegexOptions.IgnoreCase);
if (mOrbitRef.Success) // "orbit_ref" exists, but it could either be 1) empty, 2) have a JPL ref, or 3) have no JPL refs
{
string OrbitRef = mOrbitRef.Groups[1].Value;
if (!string.IsNullOrEmpty(OrbitRef)) // not empty
{
string OrbitRefRef_Pattern = @" *\|\s*orbit_ref\s*=\s*.*?\<ref";
Match mOrbitRefRef = Regex.Match(NewInfobox, OrbitRefRef_Pattern, RegexOptions.IgnoreCase);
if (mOrbitRefRef.Success) // orbit_ref has a ref
{
string OrbitRefJPLRef_Pattern = @" *\|\s*orbit_ref\s*=\s*(.*?\<ref name\s*=\s*[^\r\n\>/]*?(?:" + JPLRefNameVariants + @")[^\r\n\>]*/?\>[^\r\n]*?)";
Match mOrbitRefJPLRef = Regex.Match(NewInfobox, OrbitRefJPLRef_Pattern, RegexOptions.IgnoreCase);
// if the existing ref is already a JPL ref, do nothing
if (!mOrbitRefJPLRef.Success) // it's not jpl; append jplref
{
NewParam = S12 + Param + S3 + "=" + S4 + @"$1" + JPLDataRef_MasterOrSlave;// + "\n";
NewInfobox = Regex.Replace(NewInfobox, OrbitRef_Pattern, NewParam, RegexOptions.IgnoreCase);
if (!jplDataMasterExists) Summary += "; +jpldata master ref to orbit_ref";
jplDataMasterExists = true;
}
}
if (!mOrbitRefRef.Success) // orbit_ref has no refs; replace text with jplref
{
NewParam = S12 + Param + S3 + "=" + S4 + ThinspRefs + JPLDataRef_MasterOrSlave;// + "\n";
NewInfobox = Regex.Replace(NewInfobox, OrbitRef_Pattern, NewParam, RegexOptions.IgnoreCase);
if (!jplDataMasterExists) Summary += "; +jpldata master ref to orbit_ref";
jplDataMasterExists = true;
}
}
if (string.IsNullOrEmpty(OrbitRef)) // orbit_ref is empty; add jplref
{
NewParam = S12 + Param + S3 + "=" + S4 + ThinspRefs + JPLDataRef_MasterOrSlave;// + "\n";
NewInfobox = Regex.Replace(NewInfobox, OrbitRef_Pattern, NewParam, RegexOptions.IgnoreCase);
if (!jplDataMasterExists) Summary += "; +jpldata master ref to orbit_ref";
jplDataMasterExists = true;
}
}
else if (!mOrbitRef.Success) // "orbit_ref" DNE; append Param + ref to infobox
{
NewParam = S12 + Param + S3 + "=" + S4 + ThinspRefs + JPLDataRef_MasterOrSlave;
NewInfobox = rAppendToInfobox.Replace(NewInfobox, @"$1" + NewParam + "\n" + @"$2");
if (!jplDataMasterExists) Summary += "; +jpldata master ref to orbit_ref";
jplDataMasterExists = true;
}
bool debug3 = false;
if (debug3)
{
// see http://en.wiki.x.io/w/index.php?title=User:Tom.Reding/JPL_to_Infobox_planet&oldid=718933095
// for OEParameters, OEValStr, & OEWPSigFigs debugging display
}
}
} // end Orbital Elements Table
// Orbit Determination Parameters -> Infobox ////////////////////////////////
// isolate the "Orbit Determination Parameters" table & save components
string ODParameters_Pattern = @"Orbit\s+Determination\s+Parameters.+?\</table\>";
Match mODParameters = Regex.Match(JPL_ExternalText, ODParameters_Pattern, RegexOptions.Singleline);
if (mODParameters.Success && !Skip)
{
string ODParameters = mODParameters.Value;
Param = "observation_arc";
string ObsArc_Pattern = @"\>data\-arc span\<.+?size\=""2""\> (([\d\.,]+) days \(([\d\.]+) yr\)) \<";
Match mObsArc = Regex.Match(ODParameters, ObsArc_Pattern, RegexOptions.Singleline); // WL obsarc 1st, period 2nd
if (mObsArc.Success)
{
string ObsArc = mObsArc.Groups[1].Value;
string ObsDays = mObsArc.Groups[2].Value.Trim('.');
string ObsYears = mObsArc.Groups[3].Value.Trim('.');
// if >= 365 days, format: "years (days)", otherwise default to JPL's "days (years)"
int iObsDays = Convert.ToInt32(ObsDays.Replace(",", "")); // the pattern above admits "," separators, which Convert.ToInt32 rejects
if (iObsDays >= 365) ObsArc = ObsYears + @" yr (" + ObsDays + @" d)";
else ObsArc = Regex.Replace(ObsArc, "days?", "d", RegexOptions.IgnoreCase);
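// Examples of the two output layouts (made-up inputs in JPL's "days (years)" form):
//   "6934 days (18.98 yr)" -> "18.98 yr (6934 d)"
//   "120 days (0.33 yr)"   -> "120 d (0.33 yr)"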
// add to/update infobox
if (EqualAlignTo > 0) S3 = (EqualAlignTo >= Param.Length) ? new String(' ', EqualAlignTo - Param.Length) : "";
string OldParam = @" *\|\s*" + Param + @"\s*=[^\r\n\<]*";
string NewParam = S12 + Param + S3 + "=" + S4 + ObsArc;
Match mParam = Regex.Match(NewInfobox, OldParam);
if ( mParam.Success) NewInfobox = Regex.Replace(NewInfobox, OldParam, NewParam);
if (!mParam.Success) NewInfobox = rAppendToInfobox.Replace(NewInfobox, @"$1" + NewParam + "\n" + @"$2");
}
Param = "uncertainty";
string CCode_Pattern = @"\>condition\s+code\<.+?size\=""2""\> (\d) \<";
Match mCCode = Regex.Match(ODParameters, CCode_Pattern, RegexOptions.Singleline);
if (mCCode.Success)
{
string CCode = mCCode.Groups[1].Value;
// add to/update infobox
if (EqualAlignTo > 0) S3 = (EqualAlignTo >= Param.Length) ? new String(' ', EqualAlignTo - Param.Length) : "";
string OldParam = @" *\|\s*" + Param + @"\s*=[^\r\n\<]*";
string NewParam = S12 + Param + S3 + "=" + S4 + CCode;
Match mParam = Regex.Match(NewInfobox, OldParam);
if ( mParam.Success) NewInfobox = Regex.Replace(NewInfobox, OldParam, NewParam);
if (!mParam.Success) NewInfobox = rAppendToInfobox.Replace(NewInfobox, @"$1" + NewParam + "\n" + @"$2");
}
} // end Orbit Determination Parameters
// Physical Parameters Table -> Infobox /////////////////////////////////////
// declare regionals
Dictionary<string, string> PPValues = new Dictionary<string, string>(); // raw-ish JPL values
Dictionary<string, string> PPValWP = new Dictionary<string, string>(); // "WP-ready" template-wrapped values
Dictionary<string, string> PPErrors = new Dictionary<string, string>(); // raw JPL values
Dictionary<string, string> PPErrWP = new Dictionary<string, string>(); // "WP-ready" values, depending on if consumed by {{Convert/val}}
Dictionary<string, string> PPPatterns = new Dictionary<string, string>(); // regex patterns needed to find values on JPL's website
Dictionary<string, string> PPUnits = new Dictionary<string, string>(); // raw JPL's units
Dictionary<string, string> PPUnitsWP = new Dictionary<string, string>(); // units to be displayed in {{Infobox}}
Dictionary<string, string> PPUnitsExpected = new Dictionary<string, string>();
// isolate the "Physical Parameters Table" & save components
string PhysicalParameters_Pattern = @"Physical\s+Parameter\s+Table.+?\</table\>";
Match mPhysicalParameters = Regex.Match(JPL_ExternalText, PhysicalParameters_Pattern, RegexOptions.Singleline);
if (mPhysicalParameters.Success && !Skip)
{
string PhysicalParameters = mPhysicalParameters.Value;
// prepare to find & store most Physical Parameters
List<string> PParameters = new List<string>(new string[] { "abs_magnitude", "mass", "rotation", "albedo" }); // do dimensions later
foreach (string param in PParameters)
{
switch (param)
{
case "abs_magnitude": PPPatterns.Add(param, @"\>absolute magnitude\<.+?size\=""2""\>([\d\.]+)\<.+?size\=""2""\>([a-z]+)\<.+?size\=""2""\>(n/a|[\d\.Ee\+-]+)\<"); break;
case "mass" : PPPatterns.Add(param, @"\>GM\<.+?size\=""2""\>([\d\.Ee\+-]+)\<.+?size\=""2""\>([\w\^/]+)\<.+?size\=""2""\>(n/a|[\d\.Ee\+-]+)\<"); break;
case "rotation": PPPatterns.Add(param, @"\>rotation period\<.+?size\=""2""\>([\d\.]+)\<.+?size\=""2""\>(h)\<.+?size\=""2""\>(n/a|[\d\.Ee\+-]+)\<"); break;
case "albedo" : PPPatterns.Add(param, @"\>geometric albedo\<.+?size\=""2""\>([\d\.]+)\<.+?size\=""2""\>([^\<]+)\<.+?size\=""2""\>(n/a|[\d\.Ee\+-]+)\<"); break;
}
switch (param) // separated for readability & editability
{
case "abs_magnitude": PPUnitsExpected.Add(param, @"mag"); PPUnitsWP.Add(param, @""); break; // Preassign desired unit-display, in case all goes well.
case "mass" : PPUnitsExpected.Add(param, @"km^3/s^2"); PPUnitsWP.Add(param, @" kg"); break; // They won't get called unless the associated param remains
case "rotation" : PPUnitsExpected.Add(param, @"h"); PPUnitsWP.Add(param, @""); break; // in PParameters as "good". Otherwise removed.
case "albedo" : PPUnitsExpected.Add(param, @" "); PPUnitsWP.Add(param, @""); break;
}
}
// find & store most Physical Parameters
for (int i = 0; i < PParameters.Count; i++)
{
Param = PParameters[i];
string Pattern = PPPatterns[Param];
Match mPattern = Regex.Match(PhysicalParameters, Pattern, RegexOptions.Singleline);
if (mPattern.Success)
{
PPValues.Add(Param, mPattern.Groups[1].Value);
PPValWP.Add( Param, mPattern.Groups[1].Value);
PPUnits.Add( Param, mPattern.Groups[2].Value);
PPErrors.Add(Param, mPattern.Groups[3].Value); // never changed
PPErrWP.Add( Param, mPattern.Groups[3].Value); // changed to "n/a" if/when consumed by {{Convert/val}}
string LeadingZero = (PPErrWP[Param].Substring(0,1) == ".") ? "0" : "";
PPErrWP[Param] = LeadingZero + PPErrWP[Param];
if (PPUnits[Param] != PPUnitsExpected[Param])
{
PParameters.Remove(Param);
if (skipEntirePageIfAnyJPLUnitsDontMatchExpected)
{
Summary += @" """ + PPUnitsExpected[Param] + @""" units expected for " + Param + @"; """ + PPUnits[Param] + @""" units received.";
Skip = true;
}
i--;
}
else if (Param == "rotation")
{
PPValWP[Param] = PPValues[Param] = PPValWP[Param].Trim('.');
double RotTime = Convert.ToDouble(PPValues[Param]);
double OneMinute = (1d/60d); // in hours
string DaysMinSec = "d";
if (RotTime < 1) DaysMinSec = (RotTime < OneMinute) ? "s" : "min";
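// Examples of the target unit picked above (illustrative periods, in hours):
//   5.34 -> "d"   (1 h or longer)
//   0.5  -> "min" (under 1 h but at least 1/60 h)
//   0.01 -> "s"   (under 1/60 h ≈ 0.0167 h)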
if (PPErrWP[Param] == "n/a") PPValWP[Param] = @"{{Convert|" + PPValWP[Param] + @"|h|" + DaysMinSec + "|abbr=on|lk=on}}";
if (PPErrWP[Param] != "n/a")
{
PPValWP[Param] = @"{{Convert|" + PPValWP[Param] + @"|±|" + PPErrWP[Param] + @"|h|" + DaysMinSec + "|abbr=on|lk=on}}";
PPErrWP[Param] = "n/a"; // "n/a" ensures it won't accidentally be displayed later
}
}
else if (Param == "albedo" || Param == "abs_magnitude")
{
PPValWP[Param] = PPValues[Param] = PPValWP[Param].Trim('.');
// if the error is "n/a", do nothing; just display the bare value
if (PPErrWP[Param] != "n/a")
{
PPValWP[Param] = @"{{val|" + PPValWP[Param] + @"|" + PPErrWP[Param] + @"}}";
PPErrWP[Param] = "n/a"; // "n/a" ensures it won't accidentally be displayed later
}
}
}
else if (!mPattern.Success)
{
PParameters.Remove(Param);
i--;
}
} // end for()
// find & store remaining Physical Parameters that need to be modified (diameter or extent -> dimensions)
string Extent_Pattern = @"\>extent\<.+?size\=""2""\>([\d\.x]+)\<.+?size\=""2""\>(k?m)\<.+?size\=""2""\>(n/a|[\d\.Ee\+-]+)\<";
Match mExtent = Regex.Match(PhysicalParameters, Extent_Pattern, RegexOptions.Singleline);
string Diameter_Pattern = @"\>diameter\<.+?size\=""2""\>([\d\.]+)\<.+?size\=""2""\>(k?m)\<.+?size\=""2""\>(n/a|[\d\.Ee\+-]+)\<";
Match mDiameter = Regex.Match(PhysicalParameters, Diameter_Pattern, RegexOptions.Singleline);
// use "extent" instead of "diameter", if possible
Param = "dimensions";
if (mDiameter.Success && !mExtent.Success)
{
PParameters.Add(Param);
PPValues.Add( Param, mDiameter.Groups[1].Value);
PPValWP.Add( Param, mDiameter.Groups[1].Value);
PPUnits.Add( Param, mDiameter.Groups[2].Value); // "km" or "m"
PPUnitsWP.Add(Param, "");
PPErrors.Add( Param, mDiameter.Groups[3].Value); // never changed to "n/a"
PPErrWP.Add( Param, mDiameter.Groups[3].Value); // changed to "n/a" if/when consumed by {{Convert/val}}
}
else if (mExtent.Success)
{
PParameters.Add(Param);
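PPValues.Add( Param, mExtent.Groups[1].Value); // raw value; PPValues[param] is read later when building the comparison patterns
PPValWP.Add( Param, mExtent.Groups[1].Value.Replace('x', '×')); // 1x2x3 -> 1×2×3 km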
PPUnits.Add( Param, mExtent.Groups[2].Value); // "km" or "m"
PPUnitsWP.Add(Param, "");
PPErrors.Add( Param, mExtent.Groups[3].Value); // never changed
PPErrWP.Add( Param, mExtent.Groups[3].Value); // changed to "n/a" if/when consumed by {{Convert}}
}
if (mDiameter.Success || mExtent.Success)
{
// check units
if (PPUnits[Param] == "km") PPUnitsWP[Param] = @" [[Kilometre|km]]";
else if (PPUnits[Param] == "m") PPUnitsWP[Param] = @" [[Metre|m]]";
else
{
PParameters.Remove(Param);
if (skipEntirePageIfAnyJPLUnitsDontMatchExpected)
{
Summary += " km or m units expected for " + Param + @"; """ + PPUnits[Param] + @""" units received.";
Skip = true;
}
}
// evaluate error
if (PPErrors[Param] != "n/a" && !string.IsNullOrEmpty(PPErrors[Param]))
{
PPValWP[Param] = @"{{val|" + PPValWP[Param] + @"|" + PPErrWP[Param] + @"|ul=" + PPUnits[Param] + @"}}";
PPErrWP[Param] = "n/a"; // "n/a" ensures it won't accidentally be displayed later
PPUnitsWP[Param] = "";
}
}
// no reason to round Physical Parameter Table values, since they all have a small number of figs
// add/update {{Infobox}} Physical Parameters ////////////////////////////
foreach (string param in PParameters)
{
if (EqualAlignTo > 0) S3 = (EqualAlignTo >= param.Length) ? new String(' ', EqualAlignTo - param.Length) : "";
string S1234 = S12 + param + S3 + @"=" + S4;
if (param == "mass")
{
// calculate mass
decimal G = 0.0000000000667408m; // 6.67408e-11 m^3 kg^-1 s^-2
decimal GM = decimal.Parse(PPValWP[param], System.Globalization.NumberStyles.Float, System.Globalization.CultureInfo.InvariantCulture); // the GM pattern admits exponent notation, which Convert.ToDecimal rejects
decimal ConversionFactor = 1e9m; // 1 km^3/s^2 = 1e9 m^3/s^2
decimal M = GM * ConversionFactor / G; // mass in kg
// format mass
string NewValue = M.ToString("e2"); // "1.23e+020"
Match mNewValue = Regex.Match(NewValue, @"([\d\.]+)[Ee][\+-]0(\d\d)");
NewValue = mNewValue.Groups[1].Value + @" {{e|" + mNewValue.Groups[2].Value + @"}}"; // "1.23 {{e|20}}"
PPValWP["mass"] = NewValue;
}
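// Worked example of the mass calculation above, using an illustrative GM of 62.6284 km^3/s^2:
//   M = 62.6284 * 1e9 / 6.67408e-11 ≈ 9.38e20 kg, which the formatting step turns into "9.38 {{e|20}}".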
if (param == "mass" || param == "abs_magnitude" || param == "rotation" || param == "albedo" || param == "dimensions")
{
bool PrependJPLRefs = true; // append or prepend jpldata to parameter (JPL is usually ref #1, so default to prepend)
string OldParamFull = @" *\|\s*" + param + @"\s*=[^\r\n]*";
string OldParam = @" *\|\s*" + param + @"\s*=[^\r\n\<]*";
string NewErrWP = (PPErrWP[param] != "n/a") ? " ± " + PPErrWP[param] : ""; // display only, not for comparison
string NewParam = S1234 + PPValWP[param] + NewErrWP + PPUnitsWP[param]; // display only, not for comparison
string NewValErr = PPValues[param].Trim() + @" " + PPErrors[param].Trim().Replace("n/a", ""); // comparison only, not for display
string NewValErr_Pattern = Regex.Escape(PPValues[param].Trim()) + @"[\s\+\-±\|]*" + Regex.Escape(PPErrors[param].Trim().Replace("n/a", "")); // comparison only, not for display; escaped so that "." only matches a literal decimal point
string NewRef = @"<ref name=" + RefName_jpl + @"/>";
Match mOldParamFull = Regex.Match(NewInfobox, OldParamFull);
NewValErr = NewValErr.Trim();
// abs_mag, rotation, etc. seem to frequently contain 1 or more non-jpl refs. Append/update JPL val and/or ref depending on what we find.
string AbsMagRef_Pattern = @" *\|\s*" + param + @"\s*= *([^\r\n]+)([, ]*)( *\<ref[^\<]*?(?:\</ref\>|/?\>)[^\r\n]*?)"; // 3 groups; preserve unicode   before <ref>
Match mAbsMagRef = Regex.Match(NewInfobox, AbsMagRef_Pattern, RegexOptions.IgnoreCase);
if (mAbsMagRef.Success) // ref(s) exist; dig deeper
{
// to do: even more preliminary...way earlier in the code, I need to move jpl master refs from display-parameters to orbit_ref, physical_ref, etc...
// wait til physical_ref is a parameter
// preliminary screening: for cases when jplref and another refvalue needs updating, but has multiple refs, one of which is jplref, then remove jplref and proceed as normal
string AbsMagJPLRefAnywhere_Pattern = @" *\|\s*" + param + @"\s*= *(.*?)([ ]*\<ref name\s*=\s*[^/\>]*(?:" + JPLRefNameVariants + @")[^/\>]*)"; // 2 groups; preserve leading spaces
Match mAbsMagJPLRefAnywhere = Regex.Match(NewInfobox, AbsMagJPLRefAnywhere_Pattern, RegexOptions.IgnoreCase);
if (mAbsMagJPLRefAnywhere.Success) // jplref exists; if it's adjacent to another ref, remove jplref
{
// needs to account for: <ref>...</ref>, <ref name="blah">...</ref>, <ref name="blah"/>, & {{efn}} for non-jpl masters. 4 groups each.
// 324 Bamberga was a good test case: can't see past multiline refs, yet
string AdjaRefLeft = @"( *\|\s*" + param + @"\s*=[^\r\n]*?)(\<ref.+?(?:/\>|/ref\>)|\{\{\s*(?:efn)[^\}]+\}\})([ ]*)(\<ref name\s*=\s*[^/\>]*(?:" + JPLRefNameVariants + @")[^/\>]*/\>)";
string AdjaRefRight = @"( *\|\s*" + param + @"\s*=[^\r\n]*?)(\<ref name\s*=\s*[^/\>]*(?:" + JPLRefNameVariants + @")[^/\>]*/\>)([ ]*)(\<ref.+?(?:/\>|/ref\>)|\{\{\s*(?:efn)[^\}]+\}\})";
Match mAdjaRefLeft = Regex.Match(NewInfobox, AdjaRefLeft, RegexOptions.IgnoreCase | RegexOptions.Singleline);
if (mAdjaRefLeft.Success) NewInfobox = Regex.Replace(NewInfobox, AdjaRefLeft, @"$1$2", RegexOptions.IgnoreCase);
Match mAdjaRefRight = Regex.Match(NewInfobox, AdjaRefRight, RegexOptions.IgnoreCase | RegexOptions.Singleline);
if (mAdjaRefRight.Success) NewInfobox = Regex.Replace(NewInfobox, AdjaRefRight, @"$1$3$4", RegexOptions.IgnoreCase);
}
// proceed as normal (redo matches first, in case preliminary screening was successful)
AbsMagJPLRefAnywhere_Pattern = @" *\|\s*" + param + @"\s*= *(.*?)([ ]*\<ref name\s*=\s*[^/\>]*(?:" + JPLRefNameVariants + @")[^/\>]*)"; // 2 groups; preserve leading spaces
mAbsMagJPLRefAnywhere = Regex.Match(NewInfobox, AbsMagJPLRefAnywhere_Pattern, RegexOptions.IgnoreCase);
mOldParamFull = Regex.Match(NewInfobox, OldParamFull);
// proceed as normal
if (!mAbsMagJPLRefAnywhere.Success) // ref(s) exists, but none are JPL; prepare to update/append val and/or jplref to |param=
{
// try to find JPL values already referenced in WP param
Match mFindPPValErr = Regex.Match(mOldParamFull.Value, NewValErr_Pattern, RegexOptions.IgnoreCase); // find the JPL val+error in WP param
Match mFindWPValErr = Match.Empty; // unsuccessful by default; used to find the WP val+error in JPL
Match mWPValErr = Regex.Match(mOldParamFull.Value, @"(" + Regex.Escape(PPValues[param].Trim()) + @")[\s\+\-±\|]*([\d\.,]*)"); // 2 groups
if (mWPValErr.Success)
{
string WPValErr_Pattern = Regex.Escape(mWPValErr.Groups[1].Value) + @"[\s\+\-±\|]*" + Regex.Escape(mWPValErr.Groups[2].Value);
mFindWPValErr = Regex.Match(NewValErr, WPValErr_Pattern); // find the WP val+error in JPL
}
if (mFindPPValErr.Success && mFindWPValErr.Success) // non-JPL refs, but identical val+err found, update val+err display & prepend/append jplref (GenFixes fixes order)
{
// OldParam is (should be) independent of the existence of <br>'s
// if inadequate: isolate the entire param line here first, then operate, instead of doing the old param/newparam replacement later
OldParam = @" *\|\s*" + param + @"\s*= *([^\r\n]*?(?:}}|\>)?[hrdays ]*)(\{\{\s*(?:val|convert)[^\}]+)?" + NewValErr_Pattern + @"([^\}\{]*\}\})?" + // groups 1-3 this line
@"([^\r\n\<\{]*?)((?:\{\{\s*efn|\{\{\s*small)[^\{\}]*\}\})?([ ]*)(\{\{\s*cit[ea][ t]|\{\{\s*small|\{\{\s*efn|\<ref|\<ref(?: name=[^\>]+)?\>\s*\{\{)([^\{\}\<\>]*?)(\}\}\s*\</ref\>|\}\}|\<?/ref\>|/\>)([^\r\n]*(?:[\<\r\n]))"; // groups 4-10
NewParam = S1234 + @"${1}" + PPValWP[param] + NewErrWP + @"$5$6$7$8$9" + NewRef + @"$10";
}
else // non-JPL refs, no identical values; therefore, prepend/append val & jplref to EoL
{
// greedy version of AbsMagRef_Pattern + {{efn}}; propagate   ($3); 7 groups
OldParam = @" *\|\s*" + param + @"\s*= *([^\r\n]+?)(?:([, ]*)(\{\{\s*(?:efn|small)[^\}]+\}\}))?([, ]*)( *)((?:\<ref(?!(?: name=[^\>]*|\>) *[\r\n]+)[^\<]*?|\<ref\>\s*\{\{\s*cit[ea][ t][^\<]*?)(?:\</ref\>|/?\>))(" + "[\r\n]+)"; // $3= 
Match mStrictEoL = Regex.Match (NewInfobox, OldParam, RegexOptions.IgnoreCase); // in case the previous OldParam's \n-search fails
if (!mStrictEoL.Success) // (may no longer be necessary b/c using [\r\n] is more robust)
OldParam = @" *\|\s*" + param + @"\s*= *([^\r\n]+?)(?:([, ]*)(\{\{\s*(?:efn|small)[^\}]+\}\}))?([, ]*)( *)((?:\<ref(?!(?: name=[^\>]*|\>) *[\r\n]+)[^\<]*?|\<ref\>\s*\{\{\s*cit[ea][ t][^\<]*?)(?:\</ref\>|/?\>)[^\r\n]*)(" + "[\r\n]+)"; // $3= , also 7 groups
Match mOldParam = Regex.Match(NewInfobox, OldParam, RegexOptions.IgnoreCase);
Match mFindBr = Regex.Match(NewInfobox, @" *\|\s*" + param + @"\s*= *([^\r\n]*?\<br)", RegexOptions.IgnoreCase);
Match mFindComma = Regex.Match(NewInfobox, @" *\|\s*" + param + @"\s*= *([^\r\n]*?\,(\<ref|\{\{\s*efn|\{\{\s*cite))", RegexOptions.IgnoreCase);
bool shortValueParams = (param == "abs_magnitude");
if (mFindBr.Success || (!mFindComma.Success && !shortValueParams)) // conform-to or instate <br>-separation, except for short-value params (cleaner IB rendering)
{
if ( PrependJPLRefs) NewParam = S1234 + PPValWP[param] + NewErrWP + ThinspRefs + NewRef + @"<br />$1$2$3$4$5$6" + "\n";
if (!PrependJPLRefs) NewParam = S1234 + @"$1$2$3$4$5$6<br />" + PPValWP[param] + NewErrWP + ThinspRefs + NewRef + "\n";
}
else if (mFindComma.Success || (!mFindBr.Success && shortValueParams)) // ","-separate refs
{
if (string.IsNullOrEmpty(mOldParam.Groups[3].Value))
{
if ( PrependJPLRefs) NewParam = S1234 + PPValWP[param] + NewErrWP + "," + NewRef + @" $1$2$3$4$5$6" + "\n"; // "," instead of ThinspRefs
if (!PrependJPLRefs) NewParam = S1234 + @"$1,$6 " + PPValWP[param] + NewErrWP + ThinspRefs + NewRef + "\n"; // ThinspRefs instead of ","
}
else
{
if ( PrependJPLRefs) NewParam = S1234 + PPValWP[param] + NewErrWP + "," + NewRef + @" $1$2$3$4$5$6" + "\n"; // "," instead of ThinspRefs
if (!PrependJPLRefs) NewParam = S1234 + @"$1,$3$5$6 " + PPValWP[param] + NewErrWP + ThinspRefs + NewRef + "\n"; // ThinspRefs instead of ","
}
}
}
}
else if (mAbsMagJPLRefAnywhere.Success) // JPL ref exists! prepare to update val only
{
string Spacer = "";
string AbsMagJPLRefObvious_Pattern = @" *\|\s*" + param + @"\s*= *(.*?)((?<=\b)(?<!\:)\d[\d\.,]*|[\.,]\d+)[^\r\n\<\{\}\>]*?[ ]*(\<ref name\s*=\s*[^/\>]*(?:" + JPLRefNameVariants + @")[^/\>]*)"; // 3 groups; preserve thinsp
Match mAbsMagJPLRefObvious = Regex.Match(NewInfobox, AbsMagJPLRefObvious_Pattern, RegexOptions.IgnoreCase);
if (mAbsMagJPLRefObvious.Success) // straightforward update (finally...)
{
if (mAbsMagJPLRefObvious.Groups[1].Value.Length > 0) Spacer = " ";
OldParam = AbsMagJPLRefObvious_Pattern;
NewParam = S1234 + @"${1}" + Spacer + PPValWP[param] + NewErrWP + ThinspRefs + @"$3"; // updates val only
}
else if (!mAbsMagJPLRefObvious.Success)
{
// value might be behind multiple refs
// value might be behind val/convert and/or in front of efn/small (7 groups)
string Templates_Pattern = @" *\|\s*" + param +
@"\s*= *(.*?)(\{\{\s*(?:val|convert)[^\d\r\n]*?)?((?:(?<=\b)(?<!\:)\d[\d\.,]*|[\.,]\d+)(?:\s*[\|\+\-±]+\s*)?[\d\.,]*)([^\}\{\r\n\<]*?)(\}\})?[^\r\n\d\<\{\}\>]*?([ ]*\{\{\s*(?:efn|small)[^\}\r\n]+\}\})?([ ]*\<ref name\s*=\s*[^\>]*(?:" + JPLRefNameVariants + @")[^\>]*\>)"; // 398 Admete was a good test case
Match mTemplates = Regex.Match(NewInfobox, Templates_Pattern, RegexOptions.IgnoreCase);
// last resort; 4 groups, remove $3; preserve thinsp
string AbsMagJPLRefBurried_Pattern = @" *\|\s*" + param +
@"\s*= *(.*?)((?<=\b)(?<!\:)[\d\.\s\|\+\-±,]+?)([ ]*[^\r\n\d\<\{\}\>]*?)[ ]*(\<ref name\s*=\s*[^/\>]*(?:" + JPLRefNameVariants + @")[^/\>]*)";
Match mAbsMagJPLRefBurried = Regex.Match(NewInfobox, AbsMagJPLRefBurried_Pattern, RegexOptions.IgnoreCase);
if (mTemplates.Success) // update value surrounded by templates
{
OldParam = Templates_Pattern;
NewParam = S1234 + @"${1}" + PPValWP[param] + NewErrWP + @"$6$7";
}
else if (mAbsMagJPLRefBurried.Success) // buried update (multiple refs & values)
{
if (mAbsMagJPLRefBurried.Groups[1].Value.Length > 0) Spacer = " ";
OldParam = AbsMagJPLRefBurried_Pattern;
NewParam = S1234 + @"${1}" + Spacer + PPValWP[param] + NewErrWP + ThinspRefs + @"$4"; // updates val only
}
else Summary += @"*"; // too unusual/very complicated; take chances with the default OldParam & NewParam; use "*" to track these
}
}
// perform the update/append
NewInfobox = Regex.Replace(NewInfobox, OldParam, NewParam, RegexOptions.IgnoreCase);
}
else if (!mAbsMagRef.Success) // no <ref>s: check for {{efn|cite }}s and <br>s and append val accordingly to either infobox or |param=
{
string EfnCiteAnywhere_Pattern = @" *\|\s*" + param + @"\s*= *(.*?)( *\{\{\s*(?:efn|cit[ea][ t])[^\}]+\}\})"; // 2 groups; preserve thinsp
Match mEfnCiteAnywhere = Regex.Match(NewInfobox, EfnCiteAnywhere_Pattern, RegexOptions.IgnoreCase);
if (mEfnCiteAnywhere.Success) // no <ref>s, only {{efn|cite }}; prepare to update/append val and/or jplref to |param=
{
// try to find JPL values already referenced in WP param
Match mFindPPValErr = Regex.Match(mOldParamFull.Value, NewValErr_Pattern, RegexOptions.IgnoreCase); // find the JPL val+error in WP param
Match mFindWPValErr = Match.Empty; // Success == false by default; used to find the WP val+error in JPL
Match mWPValErr = Regex.Match(mOldParamFull.Value, @"(" + Regex.Escape(PPValues[param].Trim()) + @")[\s\+\-±\|]*([\d\.,]*)"); // 2 groups; value is escaped so its "." is matched literally
if (mWPValErr.Success)
{
string WPValErr_Pattern = Regex.Escape(mWPValErr.Groups[1].Value) + @"[\s\+\-±\|]*" + Regex.Escape(mWPValErr.Groups[2].Value);
mFindWPValErr = Regex.Match(NewValErr, WPValErr_Pattern); // find the WP val+error in JPL
}
if (mFindPPValErr.Success && mFindWPValErr.Success) // no <ref>s, only {{efn|cite }}, but identical val+err found, update val+err display & append jplref
{
// OldParam is (should be) independent of the existence of <br>'s
// if inadequate: isolate the entire param line here first, then operate, instead of doing the old param/newparam replacement later
OldParam = @" *\|\s*" + param + @"\s*= *([^\r\n]*?(?:}}|\>)?[hrdays ]*)(\{\{\s*(?:val|convert)[^\}]+)?" + NewValErr_Pattern + @"([^\}\{]*\}\})?" + // groups 1-3 this line
@"([^\r\n\<\{]*?)((?:\{\{\s*efn|\{\{\s*small)[^\{\}]*\}\})?([ ]*)(\{\{\s*cit[ea][ t]|\{\{\s*small|\{\{\s*efn)([^\r\n\{\}\<\>]*?)(\}\})([^\r\n]*(?:[\<\r\n]))"; // groups 4-10 (ref rem'd)
NewParam = S1234 + @"${1}" + PPValWP[param] + NewErrWP + @"$5$6$7$8$9$6" + NewRef + @"$10";
}
else // no <ref>s, only {{efn|cite }}, no identical values; therefore, append val & jplref to EoL
{
// greedy version of AbsMagRef_Pattern + {{efn|cite }}; propagate &thinsp; ($3); 6 groups
OldParam = @" *\|\s*" + param + @"\s*= *([^\r\n]*)([, ]*)( *)(\{\{\s*(?:efn|cit[ea][ t])[^\}\r\n]+\}\})()(" + "[\r\n]+" + @")"; // $3= 
Match mShittyRegex = Regex.Match(NewInfobox, OldParam, RegexOptions.IgnoreCase); // in case OldParam fails
if (!mShittyRegex.Success) OldParam = @" *\|\s*" + param + @"\s*= *([^\r\n]*)([, ]*)( *)(\{\{\s*(?:efn|cit[ea][ t])[^\}\r\n]+\}\})([^\r\n]*)(" + "[\r\n]+" + @")"; // $3= 
Match mFindBr = Regex.Match(NewInfobox, @" *\|\s*" + param + @"\s*= *([^\r\n]*\<br)", RegexOptions.IgnoreCase);
if ( mFindBr.Success) NewParam = S1234 + @"$1$4$5<br />" + PPValWP[param] + NewErrWP + ThinspRefs + NewRef + "\n"; // "<br />"-separate refs
if (!mFindBr.Success) NewParam = S1234 + @"$1,$4$5 " + PPValWP[param] + NewErrWP + ThinspRefs + NewRef + "\n"; // ","-separate refs
}
NewInfobox = Regex.Replace(NewInfobox, OldParam, NewParam, RegexOptions.IgnoreCase | RegexOptions.Multiline);
}
else if (!mEfnCiteAnywhere.Success) // no <ref>s, no {{efn/cite }}; nothing worth saving (probably); just replace the whole param line
{
NewParam = NewParam.TrimEnd();
Match mParam = Regex.Match(NewInfobox, OldParam);
if ( mParam.Success) NewInfobox = Regex.Replace(NewInfobox, OldParam, NewParam); // replace the whole param line
if (!mParam.Success) NewInfobox = rAppendToInfobox.Replace(NewInfobox, @"$1" + NewParam + "\n" + @"$2"); // append to infobox
}
}
} // end if (mass | abs_mag | rotation | albedo | dimensions)
else // catch any leftover params in PParameters & add/update IB
{
string OldParam = @" *\|\s*" + param + @"\s*=[^\r\n\<]*";
string NewErrWP = (PPErrWP[param] != "n/a") ? " ± " + PPErrWP[param] : "";
string NewParam = S1234 + PPValWP[param] + NewErrWP + PPUnitsWP[param];
NewParam = NewParam.TrimEnd();
Match mParam = Regex.Match(NewInfobox, OldParam);
if ( mParam.Success) NewInfobox = Regex.Replace(NewInfobox, OldParam, NewParam);
if (!mParam.Success) NewInfobox = rAppendToInfobox.Replace(NewInfobox, @"$1" + NewParam + "\n" + @"$2");
}
} // end foreach (string param in PParameters) loop
bool debug4 = false; // 2 Pallas has all of these, for debugging
if (debug4)
{
// see http://en.wiki.x.io/w/index.php?title=User:Tom.Reding/JPL_to_Infobox_planet&oldid=718933095
// for PParameters, PPValWP, PPUnits, & PPErrWP debugging display
}
} // end Physical Parameters Table
// Additional Information -> Infobox ////////////////////////////////////////
// isolate the "Additional Information" table & save components
string AI_Pattern = @"Additional\s+Information.+?\</table\>";
Match mAI = Regex.Match(JPL_ExternalText, AI_Pattern, RegexOptions.Singleline);
if (mAI.Success && !Skip)
{
string AI = mAI.Value;
string EarthMOID_Pattern = @"\>Earth MOID\</a\> = ([\d\.Ee\+-]+) ([a-z]+) ";
Match mEarthMOID = Regex.Match(AI, EarthMOID_Pattern, RegexOptions.Singleline);
if (mEarthMOID.Success)
{
Param = "moid";
string EarthMOID = mEarthMOID.Groups[1].Value;
string EarthMOIDUnits = mEarthMOID.Groups[2].Value;
if (EarthMOID.Substring(0, 1) == ".") EarthMOID = "0" + EarthMOID;
if (EarthMOIDUnits == "au") EarthMOIDUnits = "AU";
else if (skipEntirePageIfAnyJPLUnitsDontMatchExpected)
{
Summary += @" AU units expected for " + Param + @"; non-AU units received.";
Skip = true;
}
// keep JPL precision since there's no error given for these values
string OldParam = @" *\|\s*" + Param + @"\s*=[^\r\n\<]*";
Match mParam = Regex.Match(NewInfobox, OldParam);
if (EqualAlignTo > 0) S3 = (EqualAlignTo >= Param.Length) ? new String(' ', EqualAlignTo - Param.Length) : "";
// use {{Convert}} since apo, peri, semimajor have it (no error values given on JPL for MOIDs)
string MOIDsuffix = "";
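double dEarthMOID = Convert.ToDouble(EarthMOID, System.Globalization.CultureInfo.InvariantCulture); // JPL always uses "." as the decimal separator, whatever the local culture
if (dEarthMOID < 0.01) MOIDsuffix = @"|km|abbr=on}}"; // 0.01 AU = ~3.9 LD
else if (dEarthMOID < 6.68459) MOIDsuffix = @"|Gm|abbr=on}}"; // 6.68459 AU = ~1 Tm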
else MOIDsuffix = @"|Tm|abbr=on}}";
string NewParam = S12 + Param + S3 + "=" + S4 + @"{{Convert|" + EarthMOID + "|" + EarthMOIDUnits + MOIDsuffix;
// add to/update infobox
if ( mParam.Success) NewInfobox = Regex.Replace(NewInfobox, OldParam, NewParam);
if (!mParam.Success) NewInfobox = rAppendToInfobox.Replace(NewInfobox, @"$1" + NewParam + "\n" + @"$2");
}
string JupiterMOID_Pattern = @"\>Jupiter MOID\</a\> = ([\d\.Ee\+-]+) ([a-z]+) ";
Match mJupiterMOID = Regex.Match(AI, JupiterMOID_Pattern, RegexOptions.Singleline);
if (mJupiterMOID.Success)
{
Param = "jupiter_moid";
string JupiterMOID = mJupiterMOID.Groups[1].Value;
string LeadingZero = (JupiterMOID.Substring(0, 1) == ".") ? "0" : "";
JupiterMOID = LeadingZero + JupiterMOID;
string JupiterMOIDUnits = mJupiterMOID.Groups[2].Value;
if (JupiterMOIDUnits == "au") JupiterMOIDUnits = "AU";
else if (skipEntirePageIfAnyJPLUnitsDontMatchExpected)
{
Summary += @" AU units expected for " + Param + @"; non-AU units received.";
Skip = true;
}
// use {{Convert}} since apo, peri, semimajor have it (no error values given on JPL for MOIDs)
string MOIDsuffix = "";
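double dJupiterMOID = Convert.ToDouble(JupiterMOID, System.Globalization.CultureInfo.InvariantCulture); // JPL always uses "." as the decimal separator, whatever the local culture
if (dJupiterMOID < 0.01) MOIDsuffix = @"|km|abbr=on}}"; // 0.01 AU = ~3.9 LD
else if (dJupiterMOID < 6.68459) MOIDsuffix = @"|Gm|abbr=on}}"; // 6.68459 AU = ~1 Tm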
else MOIDsuffix = @"|Tm|abbr=on}}";
// add to/update infobox
if (EqualAlignTo > 0) S3 = (EqualAlignTo >= Param.Length) ? new String(' ', EqualAlignTo - Param.Length) : "";
string OldParam = @" *\|\s*" + Param + @"\s*=[^\r\n\<]*";
string NewParam = S12 + Param + S3 + "=" + S4 + @"{{Convert|" + JupiterMOID + "|" + JupiterMOIDUnits + MOIDsuffix;
Match mParam = Regex.Match(NewInfobox, OldParam);
if ( mParam.Success) NewInfobox = Regex.Replace(NewInfobox, OldParam, NewParam);
if (!mParam.Success) NewInfobox = rAppendToInfobox.Replace(NewInfobox, @"$1" + NewParam + "\n" + @"$2");
}
string TisserandJup_Pattern = @"\>T_jup\</a\> = ([\d\.Ee\+-]+) ";
Match mTisserandJup = Regex.Match(AI, TisserandJup_Pattern, RegexOptions.Singleline);
if (mTisserandJup.Success)
{
Param = "tisserand";
string TisserandJup = mTisserandJup.Groups[1].Value;
string LeadingZero = (TisserandJup.Substring(0, 1) == ".") ? "0" : "";
TisserandJup = LeadingZero + TisserandJup;
// add to/update infobox
if (EqualAlignTo > 0) S3 = (EqualAlignTo >= Param.Length) ? new String(' ', EqualAlignTo - Param.Length) : "";
string OldParam = @" *\|\s*" + Param + @"\s*=[^\r\n\<]*";
string NewParam = S12 + Param + S3 + "=" + S4 + TisserandJup;
Match mParam = Regex.Match(NewInfobox, OldParam);
if ( mParam.Success) NewInfobox = Regex.Replace(NewInfobox, OldParam, NewParam);
if (!mParam.Success) NewInfobox = rAppendToInfobox.Replace(NewInfobox, @"$1" + NewParam + "\n" + @"$2");
}
} // end Additional Information
// cleanup/finalize /////////////////////////////////////////////////////////
// replace unicode tabs with single space
NewInfobox = Regex.Replace(NewInfobox, @"\t", " ");
// force spacing conformity on several frequently-malformed params that this code otherwise doesn't touch
List<string> ConformParams = AllParamList; // choose either FrequentlyMalformedParamList (6), UpdatableParamList (~27), AlmostAllParamList (~107), or AllParamList (~113)
foreach (string param in ConformParams)
{
string ConformParam_Pattern = @"(?<!efn)(?<=[\r\n]) *\| *" + param + @"([ ]*)\=[ ]*";
string ConformParamNL_Pattern = @"(?<!efn)(?<=[\r\n]) *\| *" + param + @"([ ]*)\=([\r\n]+)";
Match mConformParam = Regex.Match(NewInfobox, ConformParam_Pattern, RegexOptions.IgnoreCase);
Match mConformParamNL = Regex.Match(NewInfobox, ConformParamNL_Pattern, RegexOptions.IgnoreCase);
if (mConformParam.Success)
{
string s3 = mConformParam.Groups[1].Value;
string Spacer = (s3.Length <= 1) ? s3 : "";
if (EqualAlignTo > 0) S3 = (EqualAlignTo >= param.Length) ? new String(' ', EqualAlignTo - param.Length) : Spacer;
string NewParam = S12 + param + S3 + "=" + S4;
NewInfobox = Regex.Replace(NewInfobox, ConformParam_Pattern, NewParam);
}
else if (mConformParamNL.Success) // have to handle "=\n" differently than "= \n", b/c regex... (or not, if \r\n works)
{
string s3 = mConformParamNL.Groups[1].Value;
string Spacer = (s3.Length <= 1) ? s3 : "";
if (EqualAlignTo > 0) S3 = (EqualAlignTo >= param.Length) ? new String(' ', EqualAlignTo - param.Length) : Spacer;
string NewParam = S12 + param + S3 + "=" + S4 + "\n";
NewInfobox = Regex.Replace(NewInfobox, ConformParamNL_Pattern, NewParam);
}
}
// trim end whitespace from parameters, except after "=" (chr32 (keyboard space), chr160 (&nbsp;), unicode thinsp)
string EndWhitespace_Pattern = @"([^\r\n]*?)[ \u00A0\u2009]+(?<!\=[ \u00A0\u2009]*)(\r?\n)"; // \r?\n matches both Windows and Unix line endings
Match mEndWhitespace = Regex.Match(NewInfobox, EndWhitespace_Pattern);
if (mEndWhitespace.Success) NewInfobox = Regex.Replace(NewInfobox, EndWhitespace_Pattern, @"$1$2");
// move lines starting w "<ref" or "</ref" up 1 line
// done once (preliminarily) near the beginning, and again here, in case any accidental \n's added
RefLine_Pattern = "[\r\n]+" + @"([ ]*\</?ref)"; // leading space & thinsp
mRefLine = Regex.Match(NewInfobox, RefLine_Pattern, RegexOptions.IgnoreCase);
if (mRefLine.Success) NewInfobox = Regex.Replace(NewInfobox, RefLine_Pattern, @"$1", RegexOptions.IgnoreCase);
// join Infobox lines that don't start with "|", " |", "{{", "}}", or " }}" onto the previous line (removes the line break before them)
string UnpipedLine_Pattern = @"[\r\n]+(?![ ]*\||\{|[ ]*\})";
Match mUnpipedLine = Regex.Match(NewInfobox, UnpipedLine_Pattern);
if (mUnpipedLine.Success) NewInfobox = Regex.Replace(NewInfobox, UnpipedLine_Pattern, "");
// move mean_radius below dimensions (per documentation) since they're closely related
string Dim_Pattern = @"( *\| *dimensions *=([^\r\n]*))" + "[\r\n]+"; // group 1 = the whole param line, without its line break
string M_R_Pattern = @"( *\| *mean_radius *=([^\r\n]*))" + "[\r\n]+";
Match mDim = Regex.Match(NewInfobox, Dim_Pattern);
Match mM_R = Regex.Match(NewInfobox, M_R_Pattern);
if (mM_R.Success && mDim.Success)
{
NewInfobox = Regex.Replace(NewInfobox, M_R_Pattern, ""); // cut mean_radius
string DimThenM_R = (mDim.Groups[1].Value + "\n" + mM_R.Groups[1].Value + "\n").Replace("$", "$$"); // escape "$" so the text is inserted literally
NewInfobox = Regex.Replace(NewInfobox, Dim_Pattern, DimThenM_R); // paste mean_radius below dim
}
// remove <!--(Infobox Comments)--> from non-empty parameters
List<string> CommentsList = new List<string>(new string[] { // regex
@"enables features for minor planets", // switches
@"on top of infobox", // name/designation/image
@"Minor planet name",
@"Minor planet category",
@"Alternative names",
@"Any alternative names for the body",
@"displayed in caption of infobox",
@"image caption",
@"\<ref\>\.\.\.\</ref\>", // discovery info
@"Date",
@"person\(s\), survey",
@"''discover_site''",
@"Average orbital speed", // orbital/physical parameters
@"Instantaneous orbital mean motion",
@"Angular distance",
@"Longitude of (the )?ascending node",
@"Longitude of peri[a-z]*",
@"Time of peri[a-z]*",
@"Argument of peri[a-z]*",
@"Earth MOID only",
@"Jupiter Tisserand parameter",
@"Proper orbital mean motion",
@"Proper perihelic precession rate",
@"Proper nodal precession rate",
@"Uni/bi/tri\-axial dimensions",
@"For planets & large, spheroidal minor planets",
@"Equatorial surface gravity",
@"Moment of inertia factor",
@"Rotational velocity",
@"Rotation period",
@"North pole right ascension",
@"North pole declination",
@"Pole ecliptic latitude",
@"Pole ecliptic longitude",
@"Apparent mag(nitude)?",
@"Absolute mag(nitude)?",
@"see ""All parameters"" section" // other
});
foreach (string Comment in CommentsList)
{
// non-empty = at least 1 non-space, non-"?" character before comment
string Comment_Pattern = @"( *=[ ]*.{1,})(?<!\=[ ]*\??[ ]*)\<\!\-\-\ *\(? *" + Comment + @" *\)? *\-\-\>"; // thinsp's incl'd
Match mComment = Regex.Match(NewInfobox, Comment_Pattern, RegexOptions.IgnoreCase);
if (mComment.Success) NewInfobox = Regex.Replace(NewInfobox, Comment_Pattern, @"$1", RegexOptions.IgnoreCase);
}
// Infobox-only thinspace standardization (unicode -> &thinsp;)
NewInfobox = NewInfobox.Replace("\u2009", "&thinsp;"); // unicode thin spaces are hard to see & could get lost; convert them to the html entity
// standardize &thinsp; usage in headings (*_ref parameters)
if (!string.IsNullOrEmpty(ThinspHeadings) && !Skip) // if &thinsp;'s are used at the start of any *_ref parameter, apply to other *_ref parameters
{
List<string> RefParamList = new List<string>(new string[] { "orbit_ref", "p_orbit_ref", "discovery_ref" });
foreach (string param in RefParamList)
{
string NonEmptyParam_Pattern = @" *\|\s*" + param + @"\s*= *(&thinsp;)?([^\r\n]{5,})"; // ">= 5" since "<ref>" = 5 characters long
Match mNonEmptyParam = Regex.Match(NewInfobox, NonEmptyParam_Pattern, RegexOptions.IgnoreCase);
if (mNonEmptyParam.Success)
{
if (EqualAlignTo > 0) S3 = (EqualAlignTo >= param.Length) ? new String(' ', EqualAlignTo - param.Length) : "";
string NewParam = S12 + param + S3 + "=" + S4 + @"&thinsp;" + mNonEmptyParam.Groups[2].Value;
NewInfobox = Regex.Replace(NewInfobox, NonEmptyParam_Pattern, NewParam);
}
}
}
// standardize &thinsp; usage b/w values & refs (only in the positive direction; i.e. don't remove any &thinsp;s)
if (!string.IsNullOrEmpty(ThinspRefs) && !Skip) // if &thinsp;'s are used > threshold% b/w values and refs, then use 100% of the time
{
string ValRef_Pattern = @"(\{\{(?:Convert|Val|Small|Deg2DMS|Deg2HMS|e\|)[^\}]+\}\}|\]\]|\)|\d+|['""a-z])\s*(\<ref|\{\{efn)";
Match mValRef = Regex.Match(NewInfobox, ValRef_Pattern, RegexOptions.IgnoreCase);
if (mValRef.Success) NewInfobox = Regex.Replace(NewInfobox, ValRef_Pattern, @"$1&thinsp;$2", RegexOptions.IgnoreCase);
}
bool debug5 = false;
if (debug5) Summary += @" ThinspUsageFraction: " + string.Format("{0:N2}", ThinspUsageFraction*100.0) + "%"; // fyi/debug
// remove Infobox lines with only whitespace + pipe
string EmptyPipes_Pattern = @"(?<=[\r\n]+)([ ]*\|[ ]*[\r\n]+)";
Match mEmptyPipes = Regex.Match(NewInfobox, EmptyPipes_Pattern, RegexOptions.Multiline);
if (mEmptyPipes.Success) NewInfobox = Regex.Replace(NewInfobox, EmptyPipes_Pattern, "");
// save infobox changes /////////////////////////////////////////////////////
ArticleText = ArticleText.Replace(OldInfobox, NewInfobox); //////////////////
// split "-->{{Infobox planet" to separate lines
string CommentIB_Pattern = @"(\-\-\>[ ]*)(\{\{\s*(?:" + InfoboxAliases_Pattern + "))";
Match mCommentIB = Regex.Match(ArticleText, CommentIB_Pattern, RegexOptions.IgnoreCase);
if (mCommentIB.Success) ArticleText = Regex.Replace(ArticleText, CommentIB_Pattern, @"$1" + "\n" + @"$2", RegexOptions.IgnoreCase);
// ArticleText: add/update access-date for RefName_jpl
JPLRef_Pattern = @"(\<ref name\s*=\s*" + RefName_jpl + @"\s*\>([^\<]+?))\|?(\s*(?:}}|\])?\s*\</ref\>)"; // strict
mJPLRef = Regex.Match(ArticleText, JPLRef_Pattern, RegexOptions.IgnoreCase | RegexOptions.Singleline);
if (mJPLRef.Success && !Skip)
{
string JPLRef = mJPLRef.Value;
string SkipTemplates_Pattern = @"\{\{\s*(?:MPCit|JPL small body)"; // these don't have/accept an access-date parameter
Match mSkipTemplates = Regex.Match(JPLRef, SkipTemplates_Pattern, RegexOptions.IgnoreCase);
if (!mSkipTemplates.Success)
{
string AccessDate_Pattern = @"(\|\s*access-?date\s*=\s*)([^\|\}\<]*?)(\s*[\|\}\<])";
Match mAccessDate = Regex.Match(JPLRef, AccessDate_Pattern);
if (mAccessDate.Success) // update
{
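string UpdateAccessDate = @"${1}" + DateTime.Today.ToString("d MMMM yyyy", System.Globalization.CultureInfo.InvariantCulture) + @"$3"; // invariant culture keeps the month name in English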
JPLRef = Regex.Replace(JPLRef, AccessDate_Pattern, UpdateAccessDate, RegexOptions.IgnoreCase | RegexOptions.Singleline);
}
if (!mAccessDate.Success) // append
{
// determine native spacing
string JPLref = mJPLRef.Groups[2].Value; // JPLref, not JPLRef
string Spacing_Pattern = @"(\s*)\|(\s*)[a-z-]+(\s*)=(\s*)";
string s1, s2, s3, s4;
s1 = s2 = s3 = s4 = "";
Match mSpacing = Regex.Match(JPLref, Spacing_Pattern);
if (mSpacing.Success)
{
s1 = mSpacing.Groups[1].Value;
s2 = mSpacing.Groups[2].Value;
s3 = mSpacing.Groups[3].Value;
s4 = mSpacing.Groups[4].Value;
}
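string AppendAccessDate = @"$1" + s1 + @"|" + s2 + @"access-date" + s3 + @"=" + s4 + DateTime.Today.ToString("d MMMM yyyy", System.Globalization.CultureInfo.InvariantCulture) + @"$3"; // invariant culture keeps the month name in English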
JPLRef = Regex.Replace(JPLRef, JPLRef_Pattern, AppendAccessDate, RegexOptions.IgnoreCase | RegexOptions.Singleline);
}
ArticleText = Regex.Replace(ArticleText, JPLRef_Pattern, JPLRef.Replace("$", "$$"), RegexOptions.IgnoreCase | RegexOptions.Singleline); // save; "$" is escaped so the ref text is inserted literally
}
}
// remove JPL link from ==External links== as redundant since it is now linked-to in the JPL master ref (2 groups)
string JPLinEL_Pattern = @"(=+\s*(?:External +links?|References?|Notes?)\s*=+.*?(?!\=\=))([\r\n]+ *[\*\[]+[^\r\n]+?ssd\.jpl\.nasa\.gov/sbdb\.cgi[^\r\n]+)";
Match mJPLinEL = Regex.Match(ArticleText, JPLinEL_Pattern, RegexOptions.IgnoreCase | RegexOptions.Singleline);
if (mJPLinEL.Success
&& !Regex.Match(mJPLinEL.Groups[2].Value, @"horizons\.cgi", RegexOptions.IgnoreCase).Success
&& !Regex.Match(mJPLinEL.Groups[2].Value, "(" + JPLRefNameVariants + ")", RegexOptions.IgnoreCase).Success)
{
// remove only the link line that was checked above (group 2)
ArticleText = ArticleText.Remove(mJPLinEL.Groups[2].Index, mJPLinEL.Groups[2].Length);
}
// Correct "[[List of asteroids/1001–2000]]" and other mislink variants
string Mislink1_Pattern = @"\[\[\s*List of (?:asteroids|minor planets)[/: ]+(\d+001)[ ‐–—-]+(\d+000)\s*\]\]";
Match mMislink1 = Regex.Match(ArticleText, Mislink1_Pattern, RegexOptions.IgnoreCase);
if (mMislink1.Success) ArticleText = Regex.Replace(ArticleText, Mislink1_Pattern, @"[[List of minor planets: $1–$2]]", RegexOptions.IgnoreCase);
// Correct "[[List of asteroids/1–1000]]" and variants
string Mislink2_Pattern = @"\[\[\s*List of (?:asteroids|minor planets)[/: ]+(1)[ ‐–—-]+(1000)\s*\]\]";
Match mMislink2 = Regex.Match(ArticleText, Mislink2_Pattern, RegexOptions.IgnoreCase);
if (mMislink2.Success) ArticleText = Regex.Replace(ArticleText, Mislink2_Pattern, @"[[List of minor planets: $1–$2]]", RegexOptions.IgnoreCase);
// dummy-check, just in case
string DummyCheck = ArticleTitle_latin + JPL_DDate + Unique_ID + Param + RefName_jpl + JPLDataRef_Master +
JPLDataRef_MasterOrSlave + LowellRef_Master + LowellRef_MasterOrSlave + OldInfobox + NewInfobox;
int IndexOfDummy = DummyCheck.IndexOf("dummy");
if (IndexOfDummy != -1 && !Skip)
{
int start = (IndexOfDummy - 25 < 0) ? 0 : (IndexOfDummy - 25);
int end = (IndexOfDummy + 25 > DummyCheck.Length) ? DummyCheck.Length : (IndexOfDummy + 25);
int width = end - start;
Summary += "DummyCheck failed, dummy! Excerpt: " + DummyCheck.Substring(start, width);
Skip = true;
}
return ArticleText; // end
}