Closed Bug 1156135 Opened 9 years ago Closed 9 years ago

Add runtime testing of graphics features

Categories

(Core :: Graphics, defect)

41 Branch
All
Windows
defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla41
Tracking Status
firefox39 + wontfix
firefox40 + wontfix
firefox41 + fixed

People

(Reporter: mattwoodrow, Assigned: mattwoodrow)

References

(Blocks 1 open bug)

Details

(Keywords: feature)

Attachments

(6 files, 17 obsolete files)

1.10 KB, patch
vladan
: review+
Details | Diff | Splinter Review
2.39 KB, patch
ajones
: review+
Details | Diff | Splinter Review
1.87 MB, image/png
Details
3.08 KB, text/plain
Details
27 bytes, text/plain
Details
10.66 KB, patch
mchang
: review+
dvander
: feedback+
Details | Diff | Splinter Review
https://wiki.mozilla.org/Performance/Runtime_Hardware_Testing

I'm working on an initial prototype of this as a firefox addon to make it easier to iterate on. I assume we'll want to bake it into firefox itself before shipping though.
First attempt is here: https://github.com/mattwoodrow/firefox-sanity-test

As it currently stands it would catch the black video on youtube issue, and would disable the hardware video decoder and restart the browser.

It probably wouldn't catch some of our black screen gfx issues because of the way it reads back from the GPU not being what actually shows up on screen.

My main work items from here:

* Add functionality to firefox to grab a snapshot of the window directly from the OS, rather than reading back from the backbuffer. This should be a much more reliable test to see if rendering is working.

* Disable features without flipping prefs. Flipping prefs is annoying because it's a permanent change and users won't ever get that feature again, even if we fix the bug that caused the test to fail. Ideally we'd have some sort of system that would disable the feature, but start allowing it again once we think it will probably work. I haven't thought much about how this would work yet.

* Add telemetry reporting of failures so we know which systems are having issues.
Vladan, any thoughts on which telemetry APIs I should be using to report these sort of failures?
Flags: needinfo?(vdjeric)
(In reply to Matt Woodrow (:mattwoodrow) from comment #2)
> Vladan, any thoughts on which telemetry APIs I should be using to report
> these sort of failures?

If your test is able to get Firefox to a working configuration, then this is simple:

1) confirm that Telemetry reports enough info in its ping.environment.gfx section. See https://ci.mozilla.org/job/mozilla-central-docs/Tree_Documentation/
2) declare an opt-out Telemetry histogram of type "enum" with one bucket for each test (current & future),
3) accumulate a value for each failed test, and perhaps accumulate a value in bucket 0 to indicate having run the test

If your test can't find a working Firefox configuration, we can craft & send a hand-rolled Telemetry ping to our servers using XHR, but this would involve more work.

What do you think?
Flags: needinfo?(vdjeric)
(In reply to Vladan Djeric (:vladan), please needinfo -- PTO April 25-May 3 from comment #3)
> (In reply to Matt Woodrow (:mattwoodrow) from comment #2)
> > Vladan, any thoughts on which telemetry APIs I should be using to report
> > these sort of failures?
> 
> If your test is able to get Firefox to a working configuration, then this is
> simple:
> 
> 1) confirm that Telemetry reports enough info in its ping.environment.gfx
> section. See
> https://ci.mozilla.org/job/mozilla-central-docs/Tree_Documentation/

Is it possible to get more of the gfx 'configuration' data reported here too? In particular, I think it'd be interesting to know which LayerManager backend is in use, whether OMTC is enabled, and (after bug 1151611 lands) if DXVA is enabled.


> 2) declare an opt-out Telemetry histogram of type "enum" with one bucket for
> each test (current & future),
> 3) accumulate a value for each failed test, and perhaps accumulate a value
> in bucket 0 to indicate having run the test
> 
> If your test can't find a working Firefox configuration, we can craft & send
> a hand-rolled Telemetry ping to our servers using XHR, but this would
> involve more work.
> 
> What do you think?

That sounds good, I'll have a play around with implementing it.
This is the pref the addon will set to disable video when the test fails.

It needs to be live so we can clear it when we run the test again.
Attachment #8596972 - Flags: review?(ajones)
The latest version of the addon (requires a build of firefox with these patches) is available at https://github.com/mattwoodrow/firefox-sanity-test
I've also got a development version of the addon which just shows a dialog box with the test results and uninstalls itself, rather than actually making any changes.

http://people.mozilla.org/~mwoodrow/sanity-test.xpi
Attachment #8596972 - Flags: review?(ajones) → review+
(In reply to Matt Woodrow (:mattwoodrow) from comment #4)
> Is it possible to get more of the gfx 'configuration' data reported here
> too? In particular, I think it'd be interesting to know which LayerManager
> backend is in use, whether OMTC is enabled, and (after bug 1151611 lands) if
> DXVA is enabled.

It's definitely possible and since the first two items are already exposed to JS via nsIGfxInfo, it's trivial to query that info from TelemetryEnvironment.jsm:

http://hg.mozilla.org/mozilla-central/annotate/86d3308ec888/toolkit/components/telemetry/TelemetryEnvironment.jsm#l1022

I'd normally write the patch, but I'm going to be on PTO starting today, can you file a bug and ask gfritzsche to help?

> That sounds good, I'll have a play around with implementing it.

Hand-crafting the Telemetry ping could be a pain, talk to gfritzsche or Benjamin Smedberg about this.
Comment on attachment 8596971 [details] [diff] [review]
Part 1: Add a histogram for reporting sanity test failures

Review of attachment 8596971 [details] [diff] [review]:
-----------------------------------------------------------------

::: toolkit/components/telemetry/Histograms.json
@@ +7970,5 @@
>      "kind": "count",
>      "description": "Tracking whether a ServiceWorker spawn gets queued due to hitting max workers per domain limit"
> +  },
> +  "GRAPHICS_SANITY_TEST": {
> +    "expires_in_version": "never",

can you add an "alert_emails" field point it to some gfx team alias, e.g. create a gfx-telemetry-alerts@mozilla.com alias for the team in ServiceNow.
This will help us identify the owner of the histogram and you will receive automatic notifications when the distribution of data in the histogram changes drastically. The emails are very rare and the false positive rate is extremely low.

@@ +7972,5 @@
> +  },
> +  "GRAPHICS_SANITY_TEST": {
> +    "expires_in_version": "never",
> +    "kind": "enumerated",
> +    "n_values": 10,

Make n_values 20, it's a pain to add more buckets after the histogram is in production.

@@ +7973,5 @@
> +  "GRAPHICS_SANITY_TEST": {
> +    "expires_in_version": "never",
> +    "kind": "enumerated",
> +    "n_values": 10,
> +    "releaseChannelCollection": "opt-out",

r+ for making this histogram opt-out, I think the user benefit is clear in this case

@@ +7974,5 @@
> +    "expires_in_version": "never",
> +    "kind": "enumerated",
> +    "n_values": 10,
> +    "releaseChannelCollection": "opt-out",
> +    "description": "Reports results from the graphics sanity test to track which drivers are having problems"

List the enum values in the description
Attachment #8596971 - Flags: review?(vdjeric) → review+
Attached patch Part 3: Sanity test addon (obsolete) — Splinter Review
Attachment #8599570 - Flags: feedback?(gavin.sharp)
Attachment #8599570 - Flags: feedback?(gavin.sharp) → feedback?(dtownsend)
Comment on attachment 8599570 [details] [diff] [review]
Part 3: Sanity test addon

Review of attachment 8599570 [details] [diff] [review]:
-----------------------------------------------------------------

I'm not sure what I need to give feedback on here. Is the intent to distribute this add-on through AMO or something?
To try and avoid the start-up penalty I'm wondering what it would take to run these tests in a standalone process that gets executed after a Firefox update is staged and before the browser restarts. We can still trigger and block on this process if a driver update is detected on start but this will kill the primary use case of checking on Firefox update. Of course this probably depends on how much the tests rely on the gfx code in Firefox.
(In reply to Dave Townsend [:mossop] from comment #12)
> Comment on attachment 8599570 [details] [diff] [review]
> Part 3: Sanity test addon
> 
> Review of attachment 8599570 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> I'm not sure what I need to give feedback on here. Is the intent to
> distribute this add-on through AMO or something?

No, the intent is to ship it by default with firefox. I'm looking for feedback on whether it should stay as an addon, or be moved into firefox frontend code. If the latter, then advice on how to do so would be helpful too.
Comment on attachment 8599570 [details] [diff] [review]
Part 3: Sanity test addon

Review of attachment 8599570 [details] [diff] [review]:
-----------------------------------------------------------------

If the intent is to ship this to all users and its function is tightly tied to the specific Firefox version as seems likely then I don't think this should be an add-on. It also shouldn't be restarting immediately after startup, we'd be better off telling the user that we've detected a problem and need to restart to resolve it.

Putting the code in a script called from nsBrowserGlue seems sensible.
Attachment #8599570 - Flags: feedback?(dtownsend) → feedback-
(In reply to Dave Townsend [:mossop] from comment #15)
 
> If the intent is to ship this to all users and its function is tightly tied
> to the specific Firefox version as seems likely then I don't think this
> should be an add-on. It also shouldn't be restarting immediately after
> startup, we'd be better off telling the user that we've detected a problem
> and need to restart to resolve it.

If that test failed, then it's because our rendering is entirely broken and our windows are probably black (or possibly uninitialized). We won't have any way to tell the user about the problem, so restarting immediately (hopefully before the primary browser window has even opened) seems like the best option.


> 
> Putting the code in a script called from nsBrowserGlue seems sensible.

Ok, thanks, I'll have a read of that code and try figure out what to do.
(In reply to Matt Woodrow (:mattwoodrow) from comment #16)
> If that test failed, then it's because our rendering is entirely broken and
> our windows are probably black (or possibly uninitialized).

"Probably", or "definitely"? How confident are you in that test? Making a mistake and causing restart loops has the potential to be worse than the "black screen" bug.

Is there no lower-level place to put a test like this, rather than having the front-end go through canvas/getImageData?
(In reply to :Gavin Sharp [email: gavin@gavinsharp.com] from comment #17)
> (In reply to Matt Woodrow (:mattwoodrow) from comment #16)
> > If that test failed, then it's because our rendering is entirely broken and
> > our windows are probably black (or possibly uninitialized).
> 
> "Probably", or "definitely"? How confident are you in that test? Making a
> mistake and causing restart loops has the potential to be worse than the
> "black screen" bug.

Definitely. It also only restarts if it changed the pref, so it can't happen more than once.


> 
> Is there no lower-level place to put a test like this, rather than having
> the front-end go through canvas/getImageData?

This is by far the easiest way to get the information we need. We need a full OMTC Layers setup, and a range of different content (colour and video thus far), and trying to instantiate that manually in c++ would be difficult.
FWIW, this is the exact same way that the 'reftest' harness works, so is well tested.
On second thought, it might be nice to keep this as an addon, but just ship it by default with firefox.

This gives us a lot more flexibility since we'd be able to tweak/add new tests mid-cycle without having to wait for the next firefox release.

I think this extra flexibility could be really valuable, and I'm not aware of any major downsides to shipping an addon instead of adding it to the front end.

Dave, Gavin, what do you think?

I'd really like to make this happen for 39, as we're still having a lot of issues with black videos on youtube and this should be an effective fix.
Flags: needinfo?(gavin.sharp)
Flags: needinfo?(dtownsend)
(In reply to Matt Woodrow (:mattwoodrow) from comment #20)
> On second thought, it might be nice to keep this as an addon, but just ship
> it by default with firefox.
> 
> This gives us a lot more flexibility since we'd be able to tweak/add new
> tests mid-cycle without having to wait for the next firefox release.
> 
> I think this extra flexibility could be really valuable, and I'm not aware
> of any major downsides to shipping an addon instead of adding it to the
> front end.

It depends what you mean by shipping as an add-on. If you want to have this add-on installed by default for users and have it update independently of Firefox then you have to be ok with users also being able to uninstall the add-on and lose the functionality entirely. There is also the UX around forcing an add-on on users which sometimes causes unwanted surprise.

> Dave, Gavin, what do you think?
> 
> I'd really like to make this happen for 39, as we're still having a lot of
> issues with black videos on youtube and this should be an effective fix.

39 goes to beta on Tuesday, I'd recommend talking to the release drivers about what your options are with uplifting things. They are very risk averse right now.
Flags: needinfo?(dtownsend)
(In reply to Dave Townsend [:mossop] from comment #21)
 
> It depends what you mean by shipping as an add-on. If you want to have this
> add-on installed by default for users and have it update independently of
> Firefox then you have to be ok with users also being able to uninstall the
> add-on and lose the functionality entirely. There is also the UX around
> forcing an add-on on users which sometimes causes unwanted surprise.
> 

What are my options?

Ideally it wouldn't show anything to users, and wouldn't be able to be removed, but we could still push updates to it without a firefox release.


> 39 goes to beta on Tuesday, I'd recommend talking to the release drivers
> about what your options are with uplifting things. They are very risk averse
> right now.

Yeah, I know. Shipping this should help a lot with our video issues, and is very low risk so I don't think it'll be a problem. I can remove the code to restart the browser (and leave us with just the piece to disable hardware accelerated video) if necessary too.
Flags: needinfo?(dtownsend)
Sheila - we need to get this into release in the 39 timeframe. We've got the platform stuff right but it would help to find the right person in the Firefox org to put some time into getting something landed. After it is landed we can get it uplifted as far as release management will let us.
(In reply to Matt Woodrow (:mattwoodrow) from comment #22)
> (In reply to Dave Townsend [:mossop] from comment #21)
>  
> > It depends what you mean by shipping as an add-on. If you want to have this
> > add-on installed by default for users and have it update independently of
> > Firefox then you have to be ok with users also being able to uninstall the
> > add-on and lose the functionality entirely. There is also the UX around
> > forcing an add-on on users which sometimes causes unwanted surprise.
> > 
> 
> What are my options?
> 
> Ideally it wouldn't show anything to users, and wouldn't be able to be
> removed, but we could still push updates to it without a firefox release.

Your options are these:

1. Build the feature into Firefox (or like pdf.js, take the add-on's code and embed it into Firefox almost as if it is a built-in feature). It won't appear in the UI to users and they can't remove it. You can't update it mid-release though (except for hotfixes).

2. Ship as an application extension. The user can see it and disable it in the add-ons manager UI, they can't remove it though and you can't update it mid-release.

3. Ship as a distribution extension. The user can see it and disable it and remove it from the add-ons manager UI. You can ship updates mid-release.

So there isn't really a way to do everything you want. Bug 1071311 is about making the transition from add-on to feature easier, but it's not going to ship before 41 at the earliest and even then doesn't support automatic mid-release updates of features. So far that hasn't been something we've ever invested time in doing, the six week cycle with the odd security and hotfix release has been enough.
Flags: needinfo?(dtownsend)
Given that we can't allow it to be disabled/removed we're left with Hobson's choice being option 1.
Flags: needinfo?(smooney)
After further discussion with Matt Woodrow we want to go with option 3 and find someone to follow up with hiding it.
(In reply to Dave Townsend [:mossop] from comment #24)
> 
> 3. Ship as a distribution extension. The user can see it and disable it and
> remove it from the add-ons manager UI. You can ship updates mid-release.
> 
> So there isn't really a way to do everything you want. Bug 1071311 is about
> making the transition from add-on to feature easier, but it's not going to
> ship before 41 at the earliest and even then doesn't support automatic
> mid-release updates of features. So far that hasn't been something we've
> ever invested time in doing, the six week cycle with the odd security and
> hotfix release has been enough.

Thanks for explaining that!

As above, we think option 3 is our best bet for the mean time.

What do I need to do to move forward with this?
(In reply to Matt Woodrow (:mattwoodrow) from comment #27)
> (In reply to Dave Townsend [:mossop] from comment #24)
> > 
> > 3. Ship as a distribution extension. The user can see it and disable it and
> > remove it from the add-ons manager UI. You can ship updates mid-release.
> > 
> > So there isn't really a way to do everything you want. Bug 1071311 is about
> > making the transition from add-on to feature easier, but it's not going to
> > ship before 41 at the earliest and even then doesn't support automatic
> > mid-release updates of features. So far that hasn't been something we've
> > ever invested time in doing, the six week cycle with the odd security and
> > hotfix release has been enough.
> 
> Thanks for explaining that!
> 
> As above, we think option 3 is our best bet for the mean time.
> 
> What do I need to do to move forward with this?

Test Pilot used to ship like this, you just need to get the XPI into <appdir>/distribution/extensions/<id>.xpi (If the extensions needs to install unpacked then it needs to be in there unpacked instead). The makefile logic that used to do that is in here but I imagine these days you'll need to do something clever in moz.build. http://hg.mozilla.org/mozilla-central/file/3fc89f2888fe/browser/app/profile/extensions/Makefile.in
Also note that when it happens I'd like to look at your plans for hiding it from the user, we would need to make sure we do so in a way that other extensions cannot abuse.
Attached patch Part 3: Sanity test addon v2 (obsolete) — Splinter Review
Attachment #8599570 - Attachment is obsolete: true
Attachment #8604438 - Flags: feedback?(dtownsend)
(In reply to Dave Townsend [:mossop] from comment #29)
> Also note that when it happens I'd like to look at your plans for hiding it
> from the user, we would need to make sure we do so in a way that other
> extensions cannot abuse.

How does the Hotfix do it?
(In reply to Anthony Jones (:kentuckyfriedtakahe, :k17e) from comment #31)
> (In reply to Dave Townsend [:mossop] from comment #29)
> > Also note that when it happens I'd like to look at your plans for hiding it
> > from the user, we would need to make sure we do so in a way that other
> > extensions cannot abuse.
> 
> How does the Hotfix do it?

It doesn't. Normally the hotfix just uninstalls itself after it has done it's business. In the rare case that it remains installed it is completely visible to the user.
Comment on attachment 8604438 [details] [diff] [review]
Part 3: Sanity test addon v2

Review of attachment 8604438 [details] [diff] [review]:
-----------------------------------------------------------------

Since your extension isn't meant to be unpacked you need to zip its files up into an XPI during the build. Build config peers should sign off on that part of the patch.

If you auto-update during Firefox runtime should the test run? Does it steal focus?

::: browser/app/profile/extensions/sanity@gfx/Makefile.in
@@ +7,5 @@
> +	install.rdf.in \
> +	bootstrap.js \
> +	chrome.manifest \
> +	$(NULL)
> +FILES_PATH = $(FINAL_TARGET)/extensions/sanity@gfx

This needs to go into distribution/extensions

::: browser/app/profile/extensions/sanity@gfx/bootstrap.js
@@ +133,5 @@
> +  var prefs = Cc["@mozilla.org/preferences-service;1"].getService(Ci.nsIPrefBranch);
> +  var gfxinfo = Cc["@mozilla.org/gfx/info;1"].getService(Ci.nsIGfxInfo);
> +
> +  // TODO: Handle dual GPU setups
> +  if (prefs.getCharPref(DRIVER_PREF) == gfxinfo.adapterDriver &&

Using Preferences.jsm will protect you from the user deleting prefs

@@ +135,5 @@
> +
> +  // TODO: Handle dual GPU setups
> +  if (prefs.getCharPref(DRIVER_PREF) == gfxinfo.adapterDriver &&
> +      prefs.getCharPref(DEVICE_PREF) == gfxinfo.adapterDeviceID &&
> +      prefs.getCharPref(VERSION_PREF) == Services.appinfo.version) {

You said you wanted to be able to update this mid-Firefox release but this will stop the test running again. Use the extension version instead.

::: browser/app/profile/extensions/sanity@gfx/install.rdf.in
@@ +6,5 @@
> +
> +
> +	<RDF xmlns="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:em="http://www.mozilla.org/2004/em-rdf#">
> +	  <Description about="urn:mozilla:install-manifest">
> +		<em:id>sanity@gfx</em:id>

Please use a unique ID ending with @mozilla.org

@@ +24,5 @@
> +
> +		<!-- Front End MetaData -->
> +		<em:name>Firefox Gfx Sanity Test</em:name>
> +		<em:description>Does basic testing for firefox graphics features, and disables them if they don't work correctly </em:description>
> +		<em:creator>Mozilla</em:creator>

Is there a plan for localisation here?
Attachment #8604438 - Flags: feedback?(dtownsend) → feedback+
One thing I failed to remember about option 3, when we decide we no longer want to ship this the extension will remain installed for users and we would have to add some special code to remove it.

Overall I'm not sold on the idea of shipping features in this way, users can be surprised by it. Gavin and Chad should make a call on whether this is the right thing. I'm also not a fan of opening a window during startup. Seems like we could detect really bad graphics issue by just looking at the main browser window and then video issues when the user actually goes to a video page.
(In reply to Dave Townsend [:mossop] from comment #33)

> Since your extension isn't meant to be unpacked you need to zip its files up
> into an XPI during the build. Build config peers should sign off on that
> part of the patch.

It seems to work unpacked, what difference does this make?


> 
> If you auto-update during Firefox runtime should the test run? Does it steal
> focus?

It should do, yes, but not steal focus (for now).

> 
> ::: browser/app/profile/extensions/sanity@gfx/Makefile.in
> @@ +7,5 @@
> > +	install.rdf.in \
> > +	bootstrap.js \
> > +	chrome.manifest \
> > +	$(NULL)
> > +FILES_PATH = $(FINAL_TARGET)/extensions/sanity@gfx
> 
> This needs to go into distribution/extensions

That folder doesn't appear to exist. I just copied the path that the other extension in this folder used.


> ::: browser/app/profile/extensions/sanity@gfx/install.rdf.in
> @@ +6,5 @@
> > +
> > +
> > +	<RDF xmlns="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:em="http://www.mozilla.org/2004/em-rdf#">
> > +	  <Description about="urn:mozilla:install-manifest">
> > +		<em:id>sanity@gfx</em:id>
> 
> Please use a unique ID ending with @mozilla.org

Ok, do you want a UUID style ID?

> 
> @@ +24,5 @@
> > +
> > +		<!-- Front End MetaData -->
> > +		<em:name>Firefox Gfx Sanity Test</em:name>
> > +		<em:description>Does basic testing for firefox graphics features, and disables them if they don't work correctly </em:description>
> > +		<em:creator>Mozilla</em:creator>
> 
> Is there a plan for localisation here?

I don't have one, no. Do you think we need to do so?


(In reply to Dave Townsend [:mossop] from comment #34)
> Overall I'm not sold on the idea of shipping features in this way, users can
> be surprised by it. Gavin and Chad should make a call on whether this is the
> right thing.

Ok, can we try get this decision made as soon as possible before I invest too much time with the current approach.

> I'm also not a fan of opening a window during startup. Seems
> like we could detect really bad graphics issue by just looking at the main
> browser window and then video issues when the user actually goes to a video
> page.

This is much harder since we have no idea what 'valid' content looks like in the main browser window. With things like aero glass, theming etc the intended content could be literally anything (including pure black, white noise or other content indistinguishable from uninitialized pixels).

Thanks for the feedback!
(In reply to Matt Woodrow (:mattwoodrow) from comment #35)
> (In reply to Dave Townsend [:mossop] from comment #33)
> 
> > Since your extension isn't meant to be unpacked you need to zip its files up
> > into an XPI during the build. Build config peers should sign off on that
> > part of the patch.
> 
> It seems to work unpacked, what difference does this make?

It impacts perf a little. Probably not a big deal but please change install.rdf to indicate that it should be unpacked to match.

> > ::: browser/app/profile/extensions/sanity@gfx/Makefile.in
> > @@ +7,5 @@
> > > +	install.rdf.in \
> > > +	bootstrap.js \
> > > +	chrome.manifest \
> > > +	$(NULL)
> > > +FILES_PATH = $(FINAL_TARGET)/extensions/sanity@gfx
> > 
> > This needs to go into distribution/extensions
> 
> That folder doesn't appear to exist. I just copied the path that the other
> extension in this folder used.

It must go into distribution/extensions if you want to use option 3 mentioned above.

> > ::: browser/app/profile/extensions/sanity@gfx/install.rdf.in
> > @@ +6,5 @@
> > > +
> > > +
> > > +	<RDF xmlns="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:em="http://www.mozilla.org/2004/em-rdf#">
> > > +	  <Description about="urn:mozilla:install-manifest">
> > > +		<em:id>sanity@gfx</em:id>
> > 
> > Please use a unique ID ending with @mozilla.org
> 
> Ok, do you want a UUID style ID?

Nope, just sanity-gfx@mozilla.org or something to make it clear it is one of our add-ons.

> > @@ +24,5 @@
> > > +
> > > +		<!-- Front End MetaData -->
> > > +		<em:name>Firefox Gfx Sanity Test</em:name>
> > > +		<em:description>Does basic testing for firefox graphics features, and disables them if they don't work correctly </em:description>
> > > +		<em:creator>Mozilla</em:creator>
> > 
> > Is there a plan for localisation here?
> 
> I don't have one, no. Do you think we need to do so?

Product's call I guess. If this is going to be installed and visible long-term for release users seems likely we'd want to do this.

> > I'm also not a fan of opening a window during startup. Seems
> > like we could detect really bad graphics issue by just looking at the main
> > browser window and then video issues when the user actually goes to a video
> > page.
> 
> This is much harder since we have no idea what 'valid' content looks like in
> the main browser window. With things like aero glass, theming etc the
> intended content could be literally anything (including pure black, white
> noise or other content indistinguishable from uninitialized pixels).
> 
> Thanks for the feedback!

Would rendering problems affect the main browser UI? Because in most cases we can at least guarantee what that should look like
(In reply to Matt Woodrow (:mattwoodrow) from comment #35)
> (In reply to Dave Townsend [:mossop] from comment #33)
> 
> > Since your extension isn't meant to be unpacked you need to zip its files up
> > into an XPI during the build. Build config peers should sign off on that
> > part of the patch.
> 
> It seems to work unpacked, what difference does this make?

On startup, Firefox needs to recursively scan directories of unpacked addons for changes (e.g. if external software is updating the extension). This extra I/O slows down startup. Please don't use an unpacked extension if you can avoid it
Comment on attachment 8604438 [details] [diff] [review]
Part 3: Sanity test addon v2

Review of attachment 8604438 [details] [diff] [review]:
-----------------------------------------------------------------

::: browser/app/profile/extensions/sanity@gfx/bootstrap.js
@@ +108,5 @@
> +    reportResult(TEST_FAILED_RENDER);
> +    if (hasHardwareLayers(win)) {
> +      prefs.setBoolPref("layers.acceleration.disabled", true);
> +      Cc['@mozilla.org/toolkit/app-startup;1'].getService(Ci.nsIAppStartup)
> +        .quit(Ci.nsIAppStartup.eForceQuit | Ci.nsIAppStartup.eRestart);

Wil Firefox do a proper shutdown when invoked via nsIAppStartup.quit(eForceQuit)? If not, Telemetry won't get a chance to shutdown and the measurement will never get reported. 
Also, if Firefox has persistent startup crash (as in the last 2 releases), Telemetry won't get sent.
Should I use the XPI in your github to check the interaction with Telemetry? It looks to be 3 weeks old
(In reply to Vladan Djeric (:vladan) -- please needinfo! from comment #38)
> Comment on attachment 8604438 [details] [diff] [review]
> Part 3: Sanity test addon v2
> 
> Review of attachment 8604438 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> ::: browser/app/profile/extensions/sanity@gfx/bootstrap.js
> @@ +108,5 @@
> > +    reportResult(TEST_FAILED_RENDER);
> > +    if (hasHardwareLayers(win)) {
> > +      prefs.setBoolPref("layers.acceleration.disabled", true);
> > +      Cc['@mozilla.org/toolkit/app-startup;1'].getService(Ci.nsIAppStartup)
> > +        .quit(Ci.nsIAppStartup.eForceQuit | Ci.nsIAppStartup.eRestart);
> 
> Wil Firefox do a proper shutdown when invoked via
> nsIAppStartup.quit(eForceQuit)? If not, Telemetry won't get a chance to
> shutdown and the measurement will never get reported. 
> Also, if Firefox has persistent startup crash (as in the last 2 releases),
> Telemetry won't get sent.
> Should I use the XPI in your github to check the interaction with Telemetry?
> It looks to be 3 weeks old

The only difference between eForceQuit and eAttemptQuit is that if a window can't be closed immediately (a beforeunload event listener for example) then we continue to quit the application anyway. I wouldn't expect it to affect telemetry shutdown.
Glandium, can you please help with me the appropriate moz.build changes to create an xpi file and get into into browser/distribution/extensions within the distribution?

Using XPI_NAME I can get a valid looking .xpi, but it's within xpi-stage, not extensions.

Thanks!
Flags: needinfo?(mh+mozilla)
There is nothing in the build system to do what you're asking for. The closest is INSTALL_EXTENSION_ID, but it will copy the unpacked extension.
Flags: needinfo?(mh+mozilla)
(In reply to Mike Hommey [:glandium] from comment #41)
> There is nothing in the build system to do what you're asking for. The
> closest is INSTALL_EXTENSION_ID, but it will copy the unpacked extension.

Ok, can I add a new similar feature that copies the packed xpi? Or possibly just change the behaviour of INSTALL_EXTENSION_ID since nothing appears to use it.

It also doesn't seem to install into the .app on OSX as far as I can tell.
Flags: needinfo?(mh+mozilla)
(In reply to Matt Woodrow (:mattwoodrow) from comment #42)
> (In reply to Mike Hommey [:glandium] from comment #41)
> > There is nothing in the build system to do what you're asking for. The
> > closest is INSTALL_EXTENSION_ID, but it will copy the unpacked extension.
> 
> Ok, can I add a new similar feature that copies the packed xpi? Or possibly
> just change the behaviour of INSTALL_EXTENSION_ID since nothing appears to
> use it.

INSTALL_EXTENSION_ID is used by comm-central, and iirc, they do use it for addons that need to stay unpacked.

I'd say add a manual rule that copies the xpi where you want it to be. We're not going to uplift the build system changes that would be required to make this work without manual rules.
Flags: needinfo?(mh+mozilla)
(In reply to Dave Townsend [:mossop] from comment #34)
> One thing I failed to remember about option 3, when we decide we no longer
> want to ship this the extension will remain installed for users and we would
> have to add some special code to remove it.
> 
> Overall I'm not sold on the idea of shipping features in this way, users can
> be surprised by it. Gavin and Chad should make a call on whether this is the
> right thing. I'm also not a fan of opening a window during startup. Seems
> like we could detect really bad graphics issue by just looking at the main
> browser window and then video issues when the user actually goes to a video
> page.

The idea with the startup test is to pre-flight our GFx and media playback subsystems to prevent chemspills. The plan is to run a reftest at startup only on the first-run or after a driver update. We need to control the contents of both the test and reference, which is why we briefly show the window. We could change the contents of the test, as I suggest here:
https://github.com/mattwoodrow/firefox-sanity-test/issues/1
Blocks: 772330
(In reply to Jet Villegas (:jet) from comment #44)
> We could change the contents of the test, as I suggest here:
> https://github.com/mattwoodrow/firefox-sanity-test/issues/1

I'd rather go for PM5544 but we should steer clear of copyright issues and just get someone creative to design something pretty.
I have it installing an xpi into distribution/extensions, but when I open the browser it opens an about:newaddon page with 'allow this installation' unchecked by default.

Is there any way to skip that and have it installed by default? Most people aren't going to allow it otherwise.

I also see 'Disabling foreign installed add-on sanity-gfx@mozilla.org in app-profile' logged.
Flags: needinfo?(dtownsend)
(In reply to Matt Woodrow (:mattwoodrow) from comment #46)
> I have it installing an xpi into distribution/extensions, but when I open
> the browser it opens an about:newaddon page with 'allow this installation'
> unchecked by default.
> 
> Is there any way to skip that and have it installed by default? Most people
> aren't going to allow it otherwise.
> 
> I also see 'Disabling foreign installed add-on sanity-gfx@mozilla.org in
> app-profile' logged.

Can you upload your patch so I can test it, that shouldn't be happening.
Comment on attachment 8604438 [details] [diff] [review]
Part 3: Sanity test addon v2

Review of attachment 8604438 [details] [diff] [review]:
-----------------------------------------------------------------

Another thought on this. We need to be careful about setting prefs disabling hardware decoding and then leaving users like that. When we've changed the pref automatically we probably want to re-check that in the future since Firefox could have improved or the user could have installed better graphics drivers. It might make sense to set another pref to say that we changed the decoding pref through this check so we know to do that re-check later.
I thought the idea is to ignore the value of the pref next time a "change event" - version or driver update - come about?
Matt, would this have caught detected the problem in bug 1166394?
Attached patch Part 3: Sanity test addon v3 (obsolete) — Splinter Review
Attachment #8604438 - Attachment is obsolete: true
(In reply to Dave Townsend [:mossop] from comment #47)
> Can you upload your patch so I can test it, that shouldn't be happening.

Sure, done.

(In reply to Dave Townsend [:mossop] from comment #48)

> 
> Another thought on this. We need to be careful about setting prefs disabling
> hardware decoding and then leaving users like that. When we've changed the
> pref automatically we probably want to re-check that in the future since
> Firefox could have improved or the user could have installed better graphics
> drivers. It might make sense to set another pref to say that we changed the
> decoding pref through this check so we know to do that re-check later.

Right, the part 2 patch in this bug adds a new pref 'media.hardware-video-decoding.failed' which is specifically to record that we failed the test, rather than someone disabled the feature manually.

The addon clears the value of this pref (and relies on it taking effect immediately) before running tests so that we can detect if the feature ever starts working again.

We don't have an equivalent pref for hardware accelerated layers because we don't support changing the value after the browser has started, but I'll be looking into a way to solve this in future revisions.

(In reply to Milan Sreckovic [:milan] from comment #50)
> Matt, would this have caught detected the problem in bug 1166394?

Hard to say, given that it was resolution dependent. My test video is 64x64, unsure if that would show the problem or not. I can test it out if you'd like.
(In reply to Matt Woodrow (:mattwoodrow) from comment #46)
> I have it installing an xpi into distribution/extensions, but when I open
> the browser it opens an about:newaddon page with 'allow this installation'
> unchecked by default.
> 
> Is there any way to skip that and have it installed by default? Most people
> aren't going to allow it otherwise.
> 
> I also see 'Disabling foreign installed add-on sanity-gfx@mozilla.org in
> app-profile' logged.

I can't reproduce this problem with your patch. I tried two things. First I ran the build with a clean profile, Firefox started and going to the add-ons manager showed the extension was installed. Then I tried a profile that had previously only had the currently nightly build run in it. Again Firefox started and installed the extension (since you've disabled the application updated check for now). In neither case did I see the sideload UI.

Can you give me a fuller copy of what the extension manager logger shows for the startup?

Also note that I had to effectively disable the test add-on here by putting a return right at the start of startup or my build would crash on first run.
Flags: needinfo?(dtownsend)
Attached patch Part 3: Sanity test addon v4 (obsolete) — Splinter Review
This seems to work nicely on windows :)
Attachment #8607777 - Attachment is obsolete: true
Attachment #8608496 - Flags: review?(dtownsend)
Attached file Crash Stack on OS X (obsolete) —
Just a heads up, while force running this on OS X w/ E10s & a debug build, I can sometimes get a crash during startup. Seems to be occurring roughly around the time we create the Compositor. The important part is:

Thread 50 Crashed:: MediaPl~back #2
0   com.apple.CoreFoundation      	0x00007fff893667d8 CFHash + 296
1   com.apple.CoreFoundation      	0x00007fff89365bc8 CFBasicHashAddValue + 1016
2   com.apple.CoreFoundation      	0x00007fff893a55ce CFDictionaryCreate + 110
3   XUL                           	0x0000000103d6c9ab mozilla::AppleVDADecoder::CreateOutputConfiguration() + 107 (AppleVDADecoder.cpp:467)
4   XUL                           	0x0000000103d6cffe mozilla::AppleVTDecoder::InitializeSession() + 222 (AppleVTDecoder.cpp:297)
5   XUL                           	0x0000000103c6e673 mozilla::MediaFormatReader::EnsureDecodersSetup() + 1347 (MediaFormatReader.cpp:453)
6   XUL                           	0x0000000103c6f680 mozilla::MediaFormatReader::OnDemuxerInitDone(nsresult) + 3440 (MediaFormatReader.cpp:359)
7   XUL                           	0x0000000103c9dff5 mozilla::MediaPromise<nsresult, mozilla::DemuxerFailureReason, true>::MethodThenValue<mozilla::MediaFormatReader, void (mozilla::MediaFormatReader::*)(nsresult), void (mozilla::MediaFormatReader::*)(mozilla::DemuxerFailureReason)>::DoResolveOrRejectInternal(mozilla::MediaPromise<nsresult, mozilla::DemuxerFailureReason, true>::ResolveOrRejectValue&) + 85 (nsRefPtr.h:50)
8   XUL                           	0x0000000103c9dad7 mozilla::MediaPromise<nsresult, mozilla::DemuxerFailureReason, true>::ThenValueBase::ResolveOrRejectRunnable::Run() + 103 (nsRefPtr.h:50)
9   XUL                           	0x0000000103cd9662 mozilla::MediaTaskQueue::Runner::Run() + 354 (MediaTaskQueue.h:135)
10  XUL                           	0x0000000101df73a5 nsThreadPool::Run() + 981 (nsCOMPtr.h:389)
11  XUL                           	0x0000000101df74dd non-virtual thunk to nsThreadPool::Run() + 13 (nsThreadPool.cpp:240)
12  XUL                           	0x0000000101df4380 nsThread::ProcessNextEvent(bool, bool*) + 1456 (nsCOMPtr.h:389)
13  XUL                           	0x0000000101e222de NS_ProcessNextEvent(nsIThread*, bool) + 126 (nsThreadUtils.cpp:265)
14  XUL                           	0x00000001021ab31f mozilla::ipc::MessagePumpForNonMainThreads::Run(base::MessagePump::Delegate*) + 415 (MessagePump.cpp:355)
15  XUL                           	0x0000000102175afc MessageLoop::Run() + 60 (message_loop.cc:517)
16  XUL                           	0x0000000101df2bdf nsThread::ThreadFunc(void*) + 335 (nsThread.cpp:364)
17  libnss3.dylib                 	0x0000000101b2df19 _pt_root + 281 (ptthread.c:215)
18  libsystem_pthread.dylib       	0x00007fff87b05268 _pthread_body + 131
19  libsystem_pthread.dylib       	0x00007fff87b051e5 _pthread_start + 176
That's bug 1164480 mason, should be fixed tomorrow :)
Depends on: 1164480
Are there plans to make this startup test resilient to gfx startup crashes? Seems like that's one of the things we'd want to achieve with this work
Comment on attachment 8608496 [details] [diff] [review]
Part 3: Sanity test addon v4

Review of attachment 8608496 [details] [diff] [review]:
-----------------------------------------------------------------

This is looking ok to me but a build peer needs to review the build config changes.

::: browser/app/profile/extensions/sanity-gfx@mozilla.org/bootstrap.js
@@ +123,5 @@
> +      Preferences.get(DEVICE_PREF, "") == gfxinfo.adapterDeviceID &&
> +      Preferences.get(VERSION_PREF, "") == addonVersion) {
> +    return;
> +  }
> +

Update these prefs here and then flush the preferences file to disk so if this test causes a crash it doesn't attempt to run again. You could also set a pref to say the test is running and if that pref is set at the start of startup then send telemetry data saying that the test failed to complete.
Attachment #8608496 - Flags: review?(dtownsend) → review+
Attached patch Part 3: Sanity test addon v5 (obsolete) — Splinter Review
Fixed review comments, carrying r=mossop.
Attachment #8609925 - Flags: review+
Comment on attachment 8609925 [details] [diff] [review]
Part 3: Sanity test addon v5

Glandium, can you please review the build changes for this?
Attachment #8609925 - Flags: review?(mh+mozilla)
(In reply to Vladan Djeric (:vladan) -- please needinfo! from comment #57)
> Are there plans to make this startup test resilient to gfx startup crashes?
> Seems like that's one of the things we'd want to achieve with this work

We certainly want to make sure the test itself doesn't crash when the browser would otherwise not crash, but otherwise, this is more to cover scenarios where we do not crash (and thus don't get the crash report), but are otherwise leaving the user in an equivalently useless situation (e.g., nothing shows up in the browser.)
Comment on attachment 8609925 [details] [diff] [review]
Part 3: Sanity test addon v5

Review of attachment 8609925 [details] [diff] [review]:
-----------------------------------------------------------------

::: browser/app/profile/extensions/{972ce4c6-7e08-4474-a285-3208198ce6fd}/moz.build
@@ +3,5 @@
>  # This Source Code Form is subject to the terms of the Mozilla Public
>  # License, v. 2.0. If a copy of the MPL was not distributed with this
>  # file, You can obtain one at http://mozilla.org/MPL/2.0/.
>  
> +DIST_SUBDIR = 'browser'

You /could/ avoid the DIST_SUBDIR reset by not using FINAL_TARGET in Makefile.in, using $(DIST)/bin instead.
Attachment #8609925 - Flags: review?(mh+mozilla) → review+
I should have been clearer when I gave this r+ that that review was predicated on either product or the Firefox owner signing off on this method of shipping this feature. Javaun is looking into this now.

Because of this and because we don't have a way to undo the add-on installing for new nightly users I've backed out the add-on portion of this before it ships in a nightly: https://hg.mozilla.org/integration/mozilla-inbound/rev/5a6a4ced599b
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Attached image Picture of Bug
On this test machine with the add on only, I get this bug where the top menu bar is colored black and the top right Windows close/minimize buttons do not render. 

STR:
1) build with attachment 8609925 [details] [diff] [review]
2) Run a fresh firefox build with a new profile
3) See bug

All runs of Firefox after the initial run look normal.
Extracts out the red box div from sanitytest.html into it's own separate test. If the readback fails for the red box, it disables layers acceleration and force restarts firefox since layers acceleration isn't a live preference. Based on Part 3 of this bug.
Attachment #8612626 - Flags: review?(matt.woodrow)
(In reply to Mason Chang [:mchang] from comment #68)
> Created attachment 8612626 [details] [diff] [review]
> Part 4: Disable layers acceleration if readback fails
> 
> Extracts out the red box div from sanitytest.html into it's own separate
> test. If the readback fails for the red box, it disables layers acceleration
> and force restarts firefox since layers acceleration isn't a live
> preference. Based on Part 3 of this bug.

I think it's better to keep it as a single html file, that should help minimize the amount of time it takes to run the test.

Without a live pref, we don't have a way to temporarily enable layer acceleration when running the test to determine if the feature has been fixed since the last time we ran. This means that once you fail the layers test once, you'll never get acceleration again (without a manual pref change).

I had something similar to this in earlier versions of the addon, but I removed it until I had a reasonable solution to the pref problem.
Does the addon run early enough that it would be possible to have a semi-live pref? One that can change but only affects new compositor instances?

Also, looks like the addon checks "layers.hardware-video-decoding.enabled" which might have to be "media.hardware-video-decoding.enabled"?
(In reply to David Anderson [:dvander] from comment #70)
> Does the addon run early enough that it would be possible to have a
> semi-live pref? One that can change but only affects new compositor
> instances?

I'd like to see how far we can take this idea...
(In reply to Matt Woodrow (:mattwoodrow) from comment #69)
> (In reply to Mason Chang [:mchang] from comment #68)
> > Created attachment 8612626 [details] [diff] [review]
> > Part 4: Disable layers acceleration if readback fails
> > 
> > Extracts out the red box div from sanitytest.html into it's own separate
> > test. If the readback fails for the red box, it disables layers acceleration
> > and force restarts firefox since layers acceleration isn't a live
> > preference. Based on Part 3 of this bug.
> 
> I think it's better to keep it as a single html file, that should help
> minimize the amount of time it takes to run the test.

The test runs pretty fast already, will one more page load hurt that much?

> 
> Without a live pref, we don't have a way to temporarily enable layer
> acceleration when running the test to determine if the feature has been
> fixed since the last time we ran. This means that once you fail the layers
> test once, you'll never get acceleration again (without a manual pref
> change).
> 
> I had something similar to this in earlier versions of the addon, but I
> removed it until I had a reasonable solution to the pref problem.

Could we change this during the firefox update process to force enable the preference? Then we'd test it once per firefox update, which is every 6 weeks.
(In reply to Mason Chang [:mchang] from comment #72)
> (In reply to Matt Woodrow (:mattwoodrow) from comment #69)
> > (In reply to Mason Chang [:mchang] from comment #68)
> > > Created attachment 8612626 [details] [diff] [review]
> > > Part 4: Disable layers acceleration if readback fails
> > > 
> > > Extracts out the red box div from sanitytest.html into it's own separate
> > > test. If the readback fails for the red box, it disables layers acceleration
> > > and force restarts firefox since layers acceleration isn't a live
> > > preference. Based on Part 3 of this bug.
> > 
> > I think it's better to keep it as a single html file, that should help
> > minimize the amount of time it takes to run the test.
> 
> The test runs pretty fast already, will one more page load hurt that much?
> 

Yes, it could. We want to stay under 500ms for the lifetime of this feature (including future startup tests.) Please be very frugal here.
I should have remembered this sooner but with the new signing requirements coming in this add-on will have to be signed by AMO in order to ship, that means landing the signed manifests in the tree too so will complicate the process of patching this code.

On the up-side I think that might give us a safe way that we can hide add-ons like these. See bug 1169719
    (In reply to Mason Chang [:mchang] from comment #72)
    > ...
    > Could we change this during the firefox update process to force enable the
    > preference? Then we'd test it once per firefox update, which is every 6
    > weeks.

    I misunderstood the original question then.  I assumed the test would be run when we update the version or when we update the driver.  This means keeping the last tested driver version somewhere as well.  For whatever reason when we run this test, we would only set the "acceleration" preference.  We would only check this "acceleration" preference when we're not running the test.

    Did I misunderstand how this is planned?
needinfo no longer needed from Gavin or Sheila.
Flags: needinfo?(smooney)
Flags: needinfo?(gavin.sharp)
Keywords: feature
OS: Mac OS X → Windows
Hardware: x86 → All
Version: 29 Branch → 41 Branch
(In reply to Milan Sreckovic [:milan] from comment #75)
>     (In reply to Mason Chang [:mchang] from comment #72)
>     > ...
>     > Could we change this during the firefox update process to force enable
> the
>     > preference? Then we'd test it once per firefox update, which is every 6
>     > weeks.
> 
>     I misunderstood the original question then.  I assumed the test would be
> run when we update the version or when we update the driver.  This means
> keeping the last tested driver version somewhere as well.  For whatever
> reason when we run this test, we would only set the "acceleration"
> preference.  We would only check this "acceleration" preference when we're
> not running the test.
> 
>     Did I misunderstand how this is planned?

The current test does run whenever we update the version or the driver. However, it only force disables media hardware decoding. It doesn't disable layers acceleration because it isn't a "live" preference. 

I was just wondering if we could force enable layers acceleration on an update if it was previously disabled so that we could test it again.

Maybe another option would be to have a two step process:
1) On a driver/release update, set layers acceleration enabled.
2) On the next Firefox start up, test layers acceleration and disable if the test fails. Then restart firefox. 

Then this add on would run potentially twice :/. I'm going to investigate testing layers during installation instead then.
(In reply to Mason Chang [:mchang] from comment #77)

> 
> The current test does run whenever we update the version or the driver.
> However, it only force disables media hardware decoding. It doesn't disable
> layers acceleration because it isn't a "live" preference. 
> 
> I was just wondering if we could force enable layers acceleration on an
> update if it was previously disabled so that we could test it again.
> 
> Maybe another option would be to have a two step process:
> 1) On a driver/release update, set layers acceleration enabled.
> 2) On the next Firefox start up, test layers acceleration and disable if the
> test fails. Then restart firefox. 
> 
> Then this add on would run potentially twice :/. I'm going to investigate
> testing layers during installation instead then.

That sounds like it would work. I don't think there's any fundamental reason we can't have a live preference for layers though, we've just never needed it in the past. It seems like we'd be better off just maing that work.
To be clear, this is what I'm proposing:
 (1) Make the layers accelerated pref semi-live. It affects windows that haven't gotten layers yet.
 (2) Assume layers accelerated = true.
 (3) If the addon test fails, set layers accelerated = false.
 (4) If the main window hasn't been painted yet, we're good to go.
 (5) If the main window has been painted already, it's probably a black screen, so force a restart of Firefox.

bug 1167477 will have to make the test asynchronous (since we have to wait for the OS to paint the test widget). That's why I'm adding step 5.
We don't even need to restart Firefox for step 5 actually. We can just re-open that widget. But I don't know if that's easier or harder.
Attached patch Remove titlebar=0 from addon (obsolete) — Splinter Review
From comment 66, the way we load the test page with specialPowers and titlebar=0 carries over the titlebar=0 from the main widget causing the bug. Removing the option of titlebar=0 fixes the bug. I'm not sure this is expected behavior or a bug in nsWindow that carries over the titlebar=0 option.

On another side note, if I don't explicitly close the addon window in JS, and load Firefox, then close Firefox, the add on window stays open even after the gecko process closes. That seems odd.
(In reply to David Anderson [:dvander] from comment #79)
>  (2) Assume layers accelerated = true.
>  (3) If the addon test fails, set layers accelerated = false.

If it crashes, then our whole application process is gone, right? So adding that test code could easily destabilize the whole Firefox process?
(In reply to Robert Kaiser (:kairo@mozilla.com) from comment #82)
> (In reply to David Anderson [:dvander] from comment #79)
> >  (2) Assume layers accelerated = true.
> >  (3) If the addon test fails, set layers accelerated = false.
> 
> If it crashes, then our whole application process is gone, right? So adding
> that test code could easily destabilize the whole Firefox process?

That is a separate problem. Assuming we solve that in a way that doesn't obsolete what is being done here, comment #79 would still apply. For example see bug 1168935.
Straight port from addon to XUL component. Works pretty well on my test machine.
Comment on attachment 8614470 [details] [diff] [review]
WIP port of addon to XUL component

Review of attachment 8614470 [details] [diff] [review]:
-----------------------------------------------------------------

Quick drive-by:

::: toolkit/components/gfx/GfxSanityTest.manifest
@@ +1,3 @@
> +component {31133b75-8162-4641-a28d-9b30c8f51d10} SanityTest.js process=main
> +contract @mozilla.org/sanity-test;1 {31133b75-8162-4641-a28d-9b30c8f51d10} process=main
> +category app-startup SanityTest service,@mozilla.org/sanity-test;1 process=main

Preferences defined in the profile aren't available at this point in startup (this was probably a problem with the old code too). You should listen for profile-after-change instead.
Attached patch WIP XUL component test (obsolete) — Splinter Review
Working XUL Component with two tests: 1 for layers acceleration and another for hardware video decoding.

Restarting with @mozilla.org/toolkit/app-startup; doesn't actually seem to work this early in the startup process and we still get the toolbar disappearing  bug from attachment 8612479 [details].
Attachment #8614470 - Attachment is obsolete: true
Attached patch WIP XUL component test v3 (obsolete) — Splinter Review
Works with the force restart and verified that if layers is disabled, we can re-enable it and verify that it works again.

(In reply to Dave Townsend [:mossop] from comment #85)
> Comment on attachment 8614470 [details] [diff] [review]
> WIP port of addon to XUL component
> 
> Review of attachment 8614470 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> Quick drive-by:
> 
> ::: toolkit/components/gfx/GfxSanityTest.manifest
> @@ +1,3 @@
> > +component {31133b75-8162-4641-a28d-9b30c8f51d10} SanityTest.js process=main
> > +contract @mozilla.org/sanity-test;1 {31133b75-8162-4641-a28d-9b30c8f51d10} process=main
> > +category app-startup SanityTest service,@mozilla.org/sanity-test;1 process=main
> 
> Preferences defined in the profile aren't available at this point in startup
> (this was probably a problem with the old code too). You should listen for
> profile-after-change instead.

Thanks! made it profile-after-change instead.
Attachment #8614933 - Attachment is obsolete: true
This video shows the normal case where everything passes. You can quickly see the slight blip of red on the top left during process startup.
Attachment #8615002 - Attachment description: Video of current UI / interactions with user → Video of current UI / interactions with user - https://vimeo.com/129729164
Comment on attachment 8615001 [details] [diff] [review]
WIP XUL component test v3

Review of attachment 8615001 [details] [diff] [review]:
-----------------------------------------------------------------

Nice, thanks for doing this. Two drive-by comments from the original addon:

::: toolkit/components/gfx/SanityTest.js
@@ +17,5 @@
> +const VERSION_PREF="sanity-test.version";
> +const DISABLE_VIDEO_PREF="media.hardware-video-decoding.failed";
> +const RUNNING_PREF="sanity-test.running";
> +const DISABLE_LAYERS_ACCELERATION_PREF="layers.acceleration.disabled";
> +const HARDWARE_VIDEO_DECODING_PREF="layers.hardware-video-decoding.enabled";

this should be media.hardware-video-decoding.enabled

@@ +179,5 @@
> +      return false;
> +    }
> +
> +    // Update the prefs so that this test doesn't run again until the next update.
> +    Preferences.set(DRIVER_PREF, gfxinfo.adapterDriver);

this should be adapterDriverVersion
Hi Anthony,

Do you still have a machine that could reproduce the youtube showing black failure? IF so, would it please be possible to test this patch locally and verify that this detects the black video?

STR:
1) Apply patch
2) Build and run with a clean profile
3) After running, check about:config
4) The preference "media.hardware-video-decoding.failed" should be true
5) The preference "media.hardware-video-decoding.enabled" should be false.

Thanks!
Attachment #8608496 - Attachment is obsolete: true
Attachment #8609108 - Attachment is obsolete: true
Attachment #8609925 - Attachment is obsolete: true
Attachment #8612626 - Attachment is obsolete: true
Attachment #8613801 - Attachment is obsolete: true
Attachment #8612626 - Flags: review?(matt.woodrow)
Attachment #8615754 - Flags: feedback?(ajones)
(In reply to David Anderson [:dvander] from comment #89)
> Comment on attachment 8615001 [details] [diff] [review]
> WIP XUL component test v3
> 
> Review of attachment 8615001 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> Nice, thanks for doing this. Two drive-by comments from the original addon:
> 
> ::: toolkit/components/gfx/SanityTest.js
> @@ +17,5 @@
> > +const VERSION_PREF="sanity-test.version";
> > +const DISABLE_VIDEO_PREF="media.hardware-video-decoding.failed";
> > +const RUNNING_PREF="sanity-test.running";
> > +const DISABLE_LAYERS_ACCELERATION_PREF="layers.acceleration.disabled";
> > +const HARDWARE_VIDEO_DECODING_PREF="layers.hardware-video-decoding.enabled";
> 
> this should be media.hardware-video-decoding.enabled
> 
> @@ +179,5 @@
> > +      return false;
> > +    }
> > +
> > +    // Update the prefs so that this test doesn't run again until the next update.
> > +    Preferences.set(DRIVER_PREF, gfxinfo.adapterDriver);
> 
> this should be adapterDriverVersion

Fixed thanks!
Comment on attachment 8615754 [details] [diff] [review]
WIP patch to verify detection if issues

Review of attachment 8615754 [details] [diff] [review]:
-----------------------------------------------------------------

::: toolkit/components/gfx/SanityTest.js
@@ +91,5 @@
> +  dump("Testing gfx features\n");
> +
> +  if (!verifyVideoRendering(canvas)) {
> +    reportResult(TEST_FAILED_VIDEO);
> +    Preferences.set(HARDWARE_VIDEO_DECODING_PREF, false);

Why are we setting this pref? Just setting the .failed pref should be sufficient, and that one is live.

We're also not clearing the prefs before running the test, so we'll never re-enable it if it starts working.
Updated with feedback from comment 92.

Anthony, can you please try this patch on the machine that failed hardware video decoding?

STR:
1) Apply patch
2) Build and run with a clean profile
3) After running, check about:config
4) The preference "media.hardware-video-decoding.failed" should be true

Thanks!
Attachment #8615754 - Attachment is obsolete: true
Attachment #8615754 - Flags: feedback?(ajones)
Attachment #8615799 - Flags: feedback?(ajones)
I'm not sure why this happens, but I tracked the toolbars disappearing to [1]. The nsXULWindow syncs attributes to loaded chrome widgets once the chrome code is loaded. I still don't understand why it's the margins that cause the problem rather than the actual DrawsTitleBar (which doesn't actually control the titlebar in terms of the style that's set at [2]). 

I'm jotting down some notes before I forget, but essentially the first chrome window we load has a custom style that dictates the titlebar should not be drawn. This is a change from the XUL Component whereas before, the first chrome window states to just use the OS default window style.

When the Chrome browser is loaded, it calls [3], which finally sets the margins. If I comment out [1], we see the menubars on all new windows. I've already ruled out Windows OS API's such as CreateWindow / RegisterClass as they do not cache actual window style data. 

[1] https://dxr.mozilla.org/mozilla-central/source/xpfe/appshell/nsXULWindow.cpp?from=nsXulWindow.cpp&case=true#1391
[2] https://dxr.mozilla.org/mozilla-central/source/xpfe/appshell/nsAppShellService.cpp?from=nsAppShellService.cpp#570
[3] https://dxr.mozilla.org/mozilla-central/source/xpfe/appshell/nsXULWindow.cpp?from=nsXulWindow.cpp&case=true#973
I'll try it out tomorrow.
This patch fixes the titlebar issue as show in attachment 8612479 [details]. Here's a workaround until we can really understand all the widget code. If we have a chrome page with titlebar=0,popup=1, the sanity test doesn't become the first toplevel window type and doesn't set the titlebar=0 across all new windows. On Windows, we've verified that this is actually still accelerated [1] according to about:support. The downside with this is that on OS X, popups explicitly get basic layers. Since we're only focusing on windows for now, it's probably ok.

[1] https://dxr.mozilla.org/mozilla-central/source/widget/windows/nsWindow.cpp#6591
I'm still not sure what's going on with the widget code. We create multiple nsAppShellServices, nsXULWindows, and nsWebShell windows, each which is suppossed to have it's own chrome flags and UI styling[1]. The first chrome window, which is the sanity test, becomes a toplevel window type [2] when the bug appears, but we again still have another chrome window that also becomes the toplevel window type. In addition, looking at the "chromemargin" attribute from comment 94 doesn't matter as the attribute isn't set by the time the sanity test loads and isn't synced over once chrome loads. The "chromemargin" attribute is loaded once browser.xul loads and has a hard coded margin [3]. I also thought that the "titlebar" style in the chrome flags wasn't being transfered in ApplyChromeFlags[4], which is synchronized between the widget and the nsXulWindow, but that doesn't seem to matter. I also couldn't force the titlebar to appear in JS by manually setting the attribute in the sanity test.

@Neil - Do you have any idea on what else we could look for? The main problem is that if the first chrome page we load is loaded with titlebar=0, all subsequent pages don't have a titlebar as shown in attachment 8612479 [details]. :jimm said you might know the widget code, but if not, any other suggestions on who to ask? Thanks!

[1] https://dxr.mozilla.org/mozilla-central/source/xpfe/appshell/nsXULWindow.h#155
[2] https://dxr.mozilla.org/mozilla-central/source/widget/nsWidgetInitData.h#16
[3] https://dxr.mozilla.org/mozilla-central/source/browser/base/content/browser.xul#47
[4] https://dxr.mozilla.org/mozilla-central/source/xpfe/appshell/nsXULWindow.cpp?from=ApplyChromeFlags&case=true#2026
Flags: needinfo?(neil)
Mason - try is down right now. If you can push a build to try then I'll test it on the problem machine tomorrow.
Flags: needinfo?(mchang)
Part 3: XUL component to test hardware media decoding. Runs as a XUL after profile-after-changed notification. Uses popup=1 to avoid being the first toplevel widget as a fix to avoid the titlebar disappearing until I figure out what's really going on with the first widget.
Attachment #8615001 - Attachment is obsolete: true
Attachment #8615799 - Attachment is obsolete: true
Attachment #8617031 - Attachment is obsolete: true
Attachment #8615799 - Flags: feedback?(ajones)
Flags: needinfo?(mchang)
Attachment #8617442 - Flags: review?(matt.woodrow)
Comment on attachment 8617442 [details] [diff] [review]
Part 3: XUL component to test hardware media decoding

Review of attachment 8617442 [details] [diff] [review]:
-----------------------------------------------------------------

Test pieces look good, need someone else to review the XUL integration though.

::: toolkit/components/gfx/SanityTest.js
@@ +17,5 @@
> +const VERSION_PREF="sanity-test.version";
> +const DISABLE_VIDEO_PREF="media.hardware-video-decoding.failed";
> +const RUNNING_PREF="sanity-test.running";
> +const DISABLE_LAYERS_ACCELERATION_PREF="layers.acceleration.disabled";
> +const HARDWARE_VIDEO_DECODING_PREF="media.hardware-video-decoding.enabled";

These last two prefs aren't needed.

@@ +65,5 @@
> +  return ctx;
> +}
> +
> +// Verify that all the 4 coloured squares of the video
> +// render as expected (with a tolerance of 5 to allow for

tolerance isn't 5 any more.

@@ +121,5 @@
> +      return false;
> +    }
> +
> +    // Update the prefs so that this test doesn't run again until the next update.
> +    Preferences.set(DISABLE_VIDEO_PREF, false);

This pref (and the RUNNING_PREF) don't really match the comment, they're setting state needed to run the test. Maybe just reorder, and add a second comment.

@@ +122,5 @@
> +    }
> +
> +    // Update the prefs so that this test doesn't run again until the next update.
> +    Preferences.set(DISABLE_VIDEO_PREF, false);
> +    Preferences.set(HARDWARE_VIDEO_DECODING_PREF, true);

We shouldn't be changing this pref.

@@ +132,5 @@
> +    return true;
> +  },
> +
> +  observe: function(subject, topic, data) {
> +    if (!this.shouldRunTest()) {

It's not obvious that this function only runs for profile-after-change, and it looks like we could be checking shouldRunTest lots of time. I believe the manifest file stops this from being the case, but it's not obvious here.

Maybe just make it:

if (topic != "profile-after-change") return;

and then get rid of the switch below.
Attachment #8617442 - Flags: review?(matt.woodrow) → review+
Carrying r+, updated with feedback from comment 101.
Attachment #8617442 - Attachment is obsolete: true
Attachment #8617574 - Flags: review+
Comment on attachment 8617574 [details] [diff] [review]
Part 3: XUL component to test hardware media decoding

Hi Dave, I see you're a XUL toolkit module owner. Can you please review the XUL integration pieces? Thanks!
Attachment #8617574 - Flags: review?(dtownsend)
Attachment #8617574 - Attachment is obsolete: true
Attachment #8617574 - Flags: review?(dtownsend)
Attachment #8617577 - Flags: review?(dtownsend)
Comment on attachment 8617577 [details] [diff] [review]
Part 3: XUL component to test hardware media decoding

Review of attachment 8617577 [details] [diff] [review]:
-----------------------------------------------------------------

::: toolkit/components/gfx/moz.build
@@ +11,5 @@
> +        'GfxSanityTest.manifest',
> +        'SanityTest.js',
> +    ]
> +
> +JAR_MANIFESTS += ['jar.mn']

No point including this if it's not windows right?
Attachment #8617577 - Flags: review?(dtownsend) → review+
(In reply to Dave Townsend [:mossop] from comment #105)
> Comment on attachment 8617577 [details] [diff] [review]
> Part 3: XUL component to test hardware media decoding
> 
> Review of attachment 8617577 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> ::: toolkit/components/gfx/moz.build
> @@ +11,5 @@
> > +        'GfxSanityTest.manifest',
> > +        'SanityTest.js',
> > +    ]
> > +
> > +JAR_MANIFESTS += ['jar.mn']
> 
> No point including this if it's not windows right?

I get a compile error on non-windows platforms if it isn't included in moz.build. It still detects at a jar file exists but isn't in the moz.build and so it errors out with.

"A jar.mn exists but it is not referenced in the moz.build file. Please define JAR_MANIFESTS."

This still happens even if we have an else block with JAR_MANIFESTS=[""]
With the(In reply to Mason Chang [:mchang] from comment #100)
> Try build with patch -
> https://treeherder.mozilla.org/#/jobs?repo=try&revision=8af6966ea07a

With this build I'm getting media.hardware-video-decoding.failed=false and my test page at http://people.mozilla.org/~ajones/blank-video/ is not showing the rotated colour square.
Hmm, can you please try this build:

https://ftp-ssl.mozilla.org/pub/mozilla.org/firefox/try-builds/mchang@mozilla.com-9373d54c9da1/

Also, when you run it again, you have to delete the old profile and start from scratch with a new profile. Can you also attach the debug spew that comes out? You'll have to run firefox from a console, but you'll get lots of spew. Thanks!
Flags: needinfo?(ajones)
Carrying r+, updated to not use a popup due to bug 1173617 and to disable the live pref if the sanity test crashes. 

@David - can you test this plus bug 1173617 and verify things work on your end and that the titlebar issue is gone? Thanks!
Attachment #8617577 - Attachment is obsolete: true
Attachment #8620723 - Flags: review+
Attachment #8620723 - Flags: feedback?(dvander)
Comment on attachment 8620723 [details] [diff] [review]
Part 3: XUL component to test hardware media decoding

Yup, looks good!
Attachment #8620723 - Flags: feedback?(dvander) → feedback+
This is the first I've heard of this, but from the discussion it sounds like it should be heading for uplift to 39.   We go to build for beta 6 on Monday morning. Can you request uplift over the weekend? Thanks!
Flags: needinfo?(matt.woodrow)
Flags: needinfo?(ajones)
Do you guys need any manual testing from the Softvision team for this fix? 

Note that while we have some different GPUs, mostly have up-to-date drivers, and I'm not sure whether we have any black screen issues on any of our machines at this time. Also, very intensive testing might not be feasible given the short time left for Firefox 39.

We can help out with verification if there are some clear scenarios that we can try.
(In reply to Liz Henry (:lizzard) from comment #112)
> This is the first I've heard of this, but from the discussion it sounds like
> it should be heading for uplift to 39.   We go to build for beta 6 on Monday
> morning. Can you request uplift over the weekend? Thanks!

I would prefer to ride the trains with the testing itself,  though the additional telemetry would be useful earlier.
(In reply to Florin Mezei, QA (:FlorinMezei) from comment #113)
> Do you guys need any manual testing from the Softvision team for this fix? 
> 
> Note that while we have some different GPUs, mostly have up-to-date drivers,
> and I'm not sure whether we have any black screen issues on any of our
> machines at this time. Also, very intensive testing might not be feasible
> given the short time left for Firefox 39.
> 
> We can help out with verification if there are some clear scenarios that we
> can try.

Yes, this would be very helpful to test. Some nice things would be to see if you have any machines that have blacklisted drivers. You can verify if you have blacklisted drivers by doing:

1) Going to about:support
2) Going to the graphics section. See if you see any blacklisted GPUs.
3) Report the GPU that is blacklisted

At that point, we could do custom builds with blacklisting disabled and see if we correctly detect the error on that machine.

Another test we would like to see is that on most machines, the test actually runs correctly and everything works. Positive testing would help as well. Maybe we can get a list of machines that are blacklisted first and then we'll try a custom build from there? Is that possible?
Flags: needinfo?(florin.mezei)
Fixed in bug 1173617.
Flags: needinfo?(neil)
(In reply to Liz Henry (:lizzard) from comment #112)
> This is the first I've heard of this, but from the discussion it sounds like
> it should be heading for uplift to 39.   We go to build for beta 6 on Monday
> morning. Can you request uplift over the weekend? Thanks!

The general feeling is that this isn't a good candidate to uplift.
Flags: needinfo?(ajones)
Can you please try the build in comment 110?
Flags: needinfo?(ajones)
Thanks for the info and the good advice!  Marking wontfix for 39.
Flags: needinfo?(matt.woodrow)
Mattwoodrow was able to find a machine and verify that the test correctly detected (crashed in this case) and recovered from the black screen.

Successful try - https://treeherder.mozilla.org/#/jobs?repo=try&revision=8b3f7f93e169
Flags: needinfo?(ajones)
https://hg.mozilla.org/mozilla-central/rev/a53b5cee6b6f
Status: REOPENED → RESOLVED
Closed: 9 years ago9 years ago
Resolution: --- → FIXED
(In reply to Mason Chang [:mchang] from comment #116)
> Yes, this would be very helpful to test. Some nice things would be to see if
> you have any machines that have blacklisted drivers. You can verify if you
> have blacklisted drivers by doing:
> 
> 1) Going to about:support
> 2) Going to the graphics section. See if you see any blacklisted GPUs.
> 3) Report the GPU that is blacklisted
> 
> At that point, we could do custom builds with blacklisting disabled and see
> if we correctly detect the error on that machine.
> 
> Another test we would like to see is that on most machines, the test
> actually runs correctly and everything works. Positive testing would help as
> well. Maybe we can get a list of machines that are blacklisted first and
> then we'll try a custom build from there? Is that possible?

As far as I know we have no blacklisted GPUs/drivers, but we'll try to have a look on the machines we have here, as time allows. 

Here's a list of GPUs that we have here, most of them likely running with fairly recent drivers:
- AMD Radeon: 3000, 5450, 6450, 6630M, 7700
- nVIDIA GeForce: 210, 320M, GT 610, 620, 640M
- Intel: Iris Pro 1024 MB, HD 2500, HD 4400

If none of them is blacklisted we should at least get some positive testing done.

I gave it a try today with the latest Nightly 41 build (fresh profiles) on Windows 7 x64. I know for sure my configuration is not blacklisted, and indeed about:support showed that everything was enabled as expected. 

I did not notice anything out of the ordinary at startup (no blip of red on the top left as indicated by the video in comment 88) so I was wondering whether there should be any sort of visible indication that the test is running, or whether we should only see a normal startup and enabled Graphics features in about:support.
Since bug 1161590, you can set preference gfx.blocklist.all to -1, and all blocklisting will be ignored.  This is if you do have blocklisted setup and you want to test as if it wasn't.
Depends on: 1176573
> I would prefer to ride the trains with the testing itself,  though the additional telemetry would be useful earlier.
OK, so, won't fix for 40.
No issues found for over a month here at Softvision, running on machines that are not blacklisted.
Flags: needinfo?(florin.mezei)
Depends on: 1286233
No longer depends on: 1286233
Depends on: 1339432
Depends on: 1360214
See Also: → 1736479
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: